Mark Needham

Thoughts on Software Development

Archive for September, 2008

Connecting to LDAP server using OpenDS in Java

with 4 comments

A colleague and I have spent the past couple of days spiking solutions for connecting to LDAP servers from Ruby.

We decided that the easiest way to do this was by using OpenDS, an open source directory service based on LDAP.

One option we came up with for doing this was to make use of the Java libraries for connecting to the LDAP server and then calling through to these from our Ruby code using the Ruby Java Bridge.

This post is not about Ruby, but about how we did it in Java to check that the idea was actually feasible.

The interfaces and classes we need to use to do this are not very obvious so it was a little bit fiddly getting it to work. The following code seems to do the trick though:

import org.opends.server.admin.client.ldap.JNDIDirContextAdaptor;

import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.ldap.InitialLdapContext;

import java.util.Hashtable;

public class OpenDs {

    public static void main(String[] args) throws NamingException {
        DirContext dirContext = createLdapContext();
        JNDIDirContextAdaptor adaptor = JNDIDirContextAdaptor.adapt(dirContext);

        // do other stuff with the adaptor
    }

    private static DirContext createLdapContext() throws NamingException {
        // Standard JNDI environment for a simple bind to a local LDAP server
        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://localhost:389");
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, "cn=Directory Manager");
        env.put(Context.SECURITY_CREDENTIALS, "password");

        return new InitialLdapContext(env, null);
    }
}

Some points about the code:

  • Port 389 is the default port for LDAP so, unless it was already taken when the server was installed, this is probably the port you need to connect to.
  • ‘Directory Manager’ is the default ‘Root User DN’ that was set up when we installed OpenDS, although there is more information on what this value may need to be in the official documentation.
  • We originally tried to connect using JNDIDirContextAdaptor.simpleBind(…) but it didn’t seem to work for us, so we went with the JNDIDirContextAdaptor.adapt(…) approach.
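Once we have the DirContext, querying the directory is just the standard JNDI search API. As a rough sketch of the ‘other stuff’ we might do with it – the base DN and filter here are hypothetical, so adjust them to match your directory layout:

import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class LdapSearch {

    // Prints the distinguished name of every person entry under the (hypothetical) base DN
    public static void printPeople(DirContext dirContext) throws NamingException {
        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);

        NamingEnumeration<SearchResult> results =
                dirContext.search("dc=example,dc=com", "(objectClass=person)", controls);

        while (results.hasMore()) {
            SearchResult result = results.next();
            System.out.println(result.getNameInNamespace());
        }
    }
}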

Written by Mark Needham

September 29th, 2008 at 11:27 pm

Posted in Java


Show pwd all the time

with 3 comments

Back in the world of the shell last week, I found myself constantly typing ‘pwd’ to work out where exactly I was in the file system, until a colleague pointed out that you can adjust your settings so that this shows up automatically on the left hand side of the prompt.

To do this you need to create or edit your .bash_profile file by entering the following command:

vi ~/.bash_profile

Then add the following line to this file:

export PS1='\u@\H \w\$ '

You should now see something like the following on your command prompt:

mneedham@Macintosh-5.local /users/mneedham/Erlang/playbox$

Another colleague pointed out that the information on the left side is completely configurable. The following entry from the bash manual pages (type ‘man bash’ then search for ‘PROMPTING’) shows how to do this:

PROMPTING
       When executing interactively, bash displays the primary prompt PS1 when it is ready to read a command, and the secondary prompt PS2 when it needs more input to complete a command.  Bash allows these prompt
       strings to be customized by inserting a number of backslash-escaped special characters that are decoded as follows:
              \a     an ASCII bell character (07)
              \d     the date in "Weekday Month Date" format (e.g., "Tue May 26")
              \D{format}
                     the format is passed to strftime(3) and the result is inserted into the prompt string; an empty format results in a locale-specific time representation.  The braces are required
              \e     an ASCII escape character (033)
              \h     the hostname up to the first `.'
              \H     the hostname
              \j     the number of jobs currently managed by the shell
              \l     the basename of the shell's terminal device name
              \n     newline
              \r     carriage return
              \s     the name of the shell, the basename of $0 (the portion following the final slash)
              \t     the current time in 24-hour HH:MM:SS format
              \T     the current time in 12-hour HH:MM:SS format
              \@     the current time in 12-hour am/pm format
              \A     the current time in 24-hour HH:MM format
              \u     the username of the current user
              \v     the version of bash (e.g., 2.00)
              \V     the release of bash, version + patchlevel (e.g., 2.00.0)
              \w     the current working directory
              \W     the basename of the current working directory
              \!     the history number of this command
              \#     the command number of this command
              \$     if the effective UID is 0, a #, otherwise a $
              \nnn   the character corresponding to the octal number nnn
              \\     a backslash
              \[     begin a sequence of non-printing characters, which could be used to embed a terminal control sequence into the prompt
              \]     end a sequence of non-printing characters

This page has more information on some of the other files that come in useful when shell scripting.

Written by Mark Needham

September 28th, 2008 at 10:50 pm

Posted in Shell Scripting


Pair Programming: What do we gain from it?

with 3 comments

My former colleague Vivek Vaid has an interesting post about parallel-paired programming in which he talks about applying lean concepts to decide when we should pair in order to get maximum productivity.

Midway through the post he mentions that the original reason we started pairing was ‘collaborative design’, which got me thinking about whether there are reasons beyond this why we would want to pair.

I have often worked with clients where the value of pair programming has been questioned and it has been suggested that we should only adhere to the practice for tasks where it adds the most value.

Clearly collaborative design is one of these, but there are some others too:

Faster Onboarding

The idea here is that someone who has been on the project for a long time (possibly the Tech Lead) can help bring newer members of the team up to speed.

They can help the new team member get their development environment up and running (although clearly making this as automated as possible is beneficial!) and walk them through the code, answering any questions they may have.

This role may also involve going through the reason the team is there (i.e. what problem we are trying to solve), an overview of the way the problem is being solved and the technologies being used to do so, patterns being used in the code, and any other information that is considered useful to the new team member.

My colleague Pat Kua covers this in more detail, referring to it as Student to Teacher in his series of posts on onboarding tips.

Using pairing in this context makes it much easier to get new team members up to speed, thereby improving their ability to be a productive part of the team.

Increasing Team Level

Pairing up senior and more junior members of a team is a very effective way to increase the level of the junior person.

More senior team members have a lot of knowledge which they can pass on to junior team members – as this knowledge comes from experience in the field, it is not necessarily something that could be gained from reading a book.

On several of the projects I have worked on, one of the more senior members of the team has been the one who provided the onboarding for new team members.

Clearly there needs to be a balance with this approach because no matter how patient the senior person is, at some stage they are going to want to have the opportunity to work with someone closer to their level of ability.

Using pairing in this way helps to bring up the level of the less experienced members of the team and allows them to learn things that more senior members of the team probably take for granted.

Knowledge Sharing

This one is harder to sell because there are no doubt other ways of sharing knowledge on teams beyond just pairing.

However, I have seen it work successfully on projects I have worked on for spreading knowledge of how different parts of the application work amongst the team. This is generally referred to as increasing the truck factor – no one member of the team should be indispensable.

I saw this as a benefit on a project I once worked on where I spent the majority of my time pairing, both with ThoughtWorks colleagues and client developers.

At the end of the project we had a meeting to discuss what handover needed to be done in order to allow the team to continue when we finished. We really struggled to find anything at all – the knowledge of how to do the vast majority of tasks was completely spread out amongst the team and no one person had knowledge that the rest of the team didn’t have.

This is almost like a side effect of pair programming but it definitely has some gains which should not be discounted.

Pair Programming vs Parallel Pair Programming

I’ve never used parallel pair programming, only pair programming, which is used the majority of the time on projects I have worked on.

I would be interested to know whether we can still gain some of these other benefits from using parallel pair programming – it seems to me to be quite a streamlined version of pair programming, and I wonder whether we would still get the useful side effects of pairing if we don’t pair all the time.

Written by Mark Needham

September 28th, 2008 at 10:19 pm

Posted in Pair Programming


Easily misused language features

without comments

In the comments of my previous post about my bad experiences with Java’s import static my colleague Carlos and several others pointed out that it is actually a useful feature when used properly.

The code base where I initially came across the feature misused it quite severely but it got me thinking about other language features I have come across which can add great value when used effectively but lead to horrific problems when misused.

Apart from import static, some other features where I can see easy potential for misuse are:

  • C#’s automatic properties, which seem to make it even easier to expose an object’s internals than it already is, but could perhaps be useful for something like XML serialisation and deserialisation.
  • C#’s var keyword, which helps to remove unnecessary type information and can lead to shorter methods when used well, but completely unreadable code when misused.
  • Ruby’s open classes, which give great flexibility when working with 3rd party libraries, for example, but can lead to difficult debugging problems if overused. The same probably also applies to C#’s extension methods.

I have no doubt there are other features in other languages and probably more in the languages I listed but those are some of the ones that stood out to me when I first saw them.

I like this quote from Nate in the comments about import static, referring to being able to tell what the code is doing just from looking at it:

The rule is simple (and somewhat subjective… if you aren’t sure, just ask me what I prefer. :) ) — if it helps the code to communicate better (especially contextually!) then use it.

Maybe a similar type of rule is applicable when using other language features as well and there is no feature that we should be using all the time – only if it makes our code easier to read and understand.

Written by Mark Needham

September 25th, 2008 at 11:18 pm

My dislike of Java’s static import

with 8 comments

While playing around with JBehave I was reminded of my dislike of the import static feature which was introduced in Java 1.5.

Using import static allows us to access static members defined in another class without referencing the class name. For example, suppose we want to use the following method in our code:

Math.max(1,2);

Normally we would need to include the name of the class (Math) that the static method (max) belongs to. By using import static we can reference max like so:

import static java.lang.Math.max;
...
max(1,2);

The benefit of this approach is that it makes the code read more fluently but the disadvantage is that you can’t immediately tell where a method lives. I want to be able to tell what is going on in the code from looking at it and anything which prevents this is a hindrance.

The official documentation even suggests using this functionality sparingly:

So when should you use static import? Very sparingly! Only use it when you’d otherwise be tempted to declare local copies of constants, or to abuse inheritance (the Constant Interface Antipattern). In other words, use it when you require frequent access to static members from one or two classes.
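To illustrate the case the documentation is describing – instead of abusing inheritance by implementing a ‘constants interface’, we can statically import the constant from the class where it naturally lives. A small sketch using java.lang.Math:

import static java.lang.Math.PI;

public class Circle {
    private final double radius;

    public Circle(double radius) {
        this.radius = radius;
    }

    public double area() {
        // Reads fluently without the Math. prefix, and no Constant Interface needed
        return PI * radius * radius;
    }
}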

On my last project we ended up saying that import static was allowed in test code because there were relatively few places the static methods could be imported from, but when it came to production code the fully qualified path was required.

Written by Mark Needham

September 24th, 2008 at 11:59 pm

Posted in Java


Onshore or Offshore – The concepts are the same?

with 3 comments

I’ve never worked on a distributed or offshore project before but, intrigued having read about Jay Fields’ experiences, I attended the ‘OffShoring: The Current State of Play’ Quarterly Technology Briefing held this morning in Sydney to hear the other side of the argument.

The underlying message for me was that a lot of the concepts we apply for onshore projects are equally important for offshore projects.

Forrester’s Tim Sheedy started off by providing some research data on the state of IT offshoring and some reasons he had identified around which types of work should be offshored, before closing with some reasons that offshoring might fail if not done correctly.

My colleague Dharmarajan Sitaraman followed, speaking about ThoughtWorks’ experience in offshoring, again covering some of the things that can go wrong, the use of agile in offshore development and finally gave some recommendations on making it work.

What I learnt

  • Tim spoke about a tool Forrester have designed for working out whether your application should be offshored. I’m not sure exactly what the conclusion around this was, but ThoughtWorks aims to do work traditionally considered ‘not safe’ for offshore development – i.e. enterprise business application development/product development.
  • Problems such as poor communication with the business, inability to deliver business requirements and failure to adapt to change were listed as reasons that offshore development can fail. From my experience these are the same reasons we can fail on any project, especially if we take an approach that doesn’t encourage quick feedback loops.
  • The advice given with regard to selecting an offshore vendor was to select them based on what they are good at. This seemed to be fairly sensible advice for selecting any partner. The benefit of taking an agile approach is clearly reduced if access to the business or final users is not possible, for example.
  • Software as a service was a theme both speakers considered. It was suggested that Australians don’t like to take risks with IT and therefore prefer a partner-led approach. For me the first thing that came to mind when I heard this idea was that if software is a service then it should cover the whole life cycle of an application rather than just the development. I think the intention was more that IT should be considered more important to organisations than it currently is.
  • One of the questions from the audience was around how to price these engagements. A risk/reward model was suggested as a good approach – the idea being to incentivise the partner to help the business achieve its outcomes. The underlying message seemed to be that trust is necessary to achieve a successful outcome. It was also mentioned that this pricing model generally works better for longer (18-24 month) projects.
  • The idea that you get what you pay for was also brought up with regard to pricing. Although cost is clearly an important driver when deciding to offshore work, it shouldn’t be taken to the absolute extreme.
  • Tim also spoke briefly about some of his ideas around the 21st century IT shop where he summed up the difference between the agile and waterfall approaches to software development as follows:

    Waterfall gives you what you ask for…agile gives you what you want

Overall

It was certainly interesting to see the view of software development from a different perspective and to see the data about how other organisations consider their relationship with IT vendors.

It would have been interesting to hear more about the distributed agile approach to development that Jay spoke about and whether/how this differs from complete offshoring but that wasn’t the focus of this talk.

Written by Mark Needham

September 24th, 2008 at 7:08 am

Testing with Joda Time

with 6 comments

The alternative to dealing with java.util.Date, which I wrote about in a previous post, is to make use of the Joda Time library. I’m led to believe that a lot of the ideas from Joda Time will in fact be in Java 7.

Nevertheless when testing with Joda Time there are times when it would be useful for us to have control over the time our code is using.

Why would we want to control the time?

There are a couple of situations that come to mind where it may be useful to be able to control the time in a system:

  • There is a piece of code which only executes at a certain time of the day. To see if it executes correctly we need to be able to set the system time to be that time.
  • Date calculations – we want to do a calculation on a date and verify the result. We therefore need to be able to control the original date.

Given that, there are two approaches I have seen that allow us to do this:

Freezing time

Joda includes a DateTimeUtils class which allows us to change the current time.

On the projects I’ve worked on we would typically wrap these calls in a more descriptive class. For example:

import org.joda.time.DateTime;
import org.joda.time.DateTimeUtils;

public class JodaDateTime {
    // Pin Joda's notion of 'now' to a fixed instant
    public static void freeze(DateTime frozenDateTime) {
        DateTimeUtils.setCurrentMillisFixed(frozenDateTime.getMillis());
    }

    // Restore the real system clock
    public static void unfreeze() {
        DateTimeUtils.setCurrentMillisSystem();
    }
}

This approach works better when DateTime is deeply ingrained in the system and it is difficult for us to abstract dates behind another interface.

The benefit of taking this approach is that we can test dates without having to change any of our code to add in another level of abstraction, which would lead to further complexity.
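As a rough usage sketch (assuming JUnit 4 and the wrapper above – the test name is my own), a test freezes the clock, makes its assertions and then makes sure the real clock is restored:

import org.joda.time.DateTime;
import org.junit.Test;

import static org.junit.Assert.assertEquals;

public class FrozenTimeTest {

    @Test
    public void anyNewDateTimeShouldSeeTheFrozenClock() {
        DateTime frozen = new DateTime(2008, 9, 24, 23, 0, 0, 0);
        JodaDateTime.freeze(frozen);
        try {
            // Anything that asks Joda for 'now' gets the frozen instant
            assertEquals(frozen, new DateTime());
        } finally {
            JodaDateTime.unfreeze(); // always restore the system clock
        }
    }
}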

Time Provider

The alternative approach is to have a TimeProvider which we can pass around the system. This would typically be passed into the constructor of any classes which need to make use of time.

For example, we might have the following interface defined:

import org.joda.time.DateTime;

public interface TimeProvider {
    DateTime getCurrentDateTime();
}

We can then mock out getCurrentDateTime() to return whatever date we want in our tests.

The advantage of this approach is that it allows more flexibility around the implementation – it could be used to sync system and local machine dates, for example – although at the cost of adding extra complexity.

This approach is similar to the plugin pattern Martin Fowler details in Patterns of Enterprise Application Architecture in that we use one implementation of TimeProvider in our application and then a different version for testing.
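A minimal sketch of the two sides of that plugin arrangement (the class names here are my own invention):

import org.joda.time.DateTime;

// Used in the application: delegates to the system clock
public class SystemTimeProvider implements TimeProvider {
    public DateTime getCurrentDateTime() {
        return new DateTime();
    }
}

// Used in tests: always returns the date the test supplies
// (in a real code base this would live in its own file, or be replaced by a mock)
class FixedTimeProvider implements TimeProvider {
    private final DateTime fixedDateTime;

    FixedTimeProvider(DateTime fixedDateTime) {
        this.fixedDateTime = fixedDateTime;
    }

    public DateTime getCurrentDateTime() {
        return fixedDateTime;
    }
}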

I generally favour this approach if possible although if a quick win is needed then the first approach is fine.

Written by Mark Needham

September 24th, 2008 at 5:11 am

Posted in Java


Where are we now? Where do we want to be?

without comments

Listening to Dan North speaking last week I was reminded of one of my favourite NLP[*] techniques for making improvements on projects.

The technique is the TOTE (Test, Operate, Test, Exit), and it is designed to help us get from where we are now to where we want to be via short feedback loops.

On my previous project we had a situation where we needed to build and deploy our application in order to show it to the client in a showcase.

The first time we did this we did most of the process manually – it took three hours and even then still didn’t work properly. It was clear that we needed to do something about this. We needed the process to be automated.

Before we did this I mentally worked out what the difference was between where we were now and what the process would look like when we were at our desired state.

A full description of the technique can be found in the NLP Workbook, but in summary these are the steps for using the TOTE technique:

  1. What is our present state?
  2. What is our desired state?
  3. What specific steps and stages do we need to go through to get there?

We then execute an action and re-compare the current state to the desired state until they match. Then we exit.

In our situation:

  1. To deploy our application we need to manually build it and then copy the files to the showcase machine.
  2. We want this process to happen automatically each time a change is made to the application.

For step 3 to make sense more context is needed.

For our application to be ready to use we needed to build it and deploy it to the user’s desktop and build and deploy several other services to an application repository so that our application could stream them onto the desktop.

The small steps for achieving step 3 were:

  • Write a build for building the application and making the artifacts available
  • Write a script to deploy the application to the user’s desktop
  • Edit the build for building the services to make artifacts available
  • Write a script to deploy these services to the repository

This evolved slightly so that we could get TeamCity to simulate the process for our functional testing and then run a script to deploy it on the showcase machine, but the idea is the same.

There’s nothing mind blowing about this approach. It’s just a way of helping us to clarify what exactly it is we want to do and providing an easy way of getting there as quickly as possible.


* Sometimes when I mention NLP people get a bit defensive as it has been fed to them previously as a toolkit for solving all problems.

We need to remember that NLP is a set of communication techniques gathered by observing effective communicators and then recorded so that others could learn from them.

When used properly ideas from NLP can help us to clarify what we want to do and improve our ability to communicate with each other.

Written by Mark Needham

September 20th, 2008 at 5:32 pm

Similarities between Domain Driven Design & Object Oriented Programming

with 3 comments

At the Alt.NET UK Conference, which I attended over the weekend, it occurred to me while listening to some of the discussions on Domain Driven Design that a lot of the ideas in DDD are actually very similar to those practiced in Object Oriented Programming and its related best practices.

The similarities

Anaemic Domain Model/Law of Demeter

There was quite a bit of discussion in the session about anaemic domain models.

An anaemic domain model is one where a lot of the objects are merely data holders and do not actually have any behaviour inside them. While it has a fancy name, in OO terms this problem materialises due to our failure to adhere to the Law of Demeter.

My colleague Dan Manges has a brilliant post describing this principle, but a tell-tale sign is that if you see code like the following in your code base then you’re probably breaking it.

object.GetSomething().GetSomethingElse().GetSomethingElse()

This is often referred to as ‘train wreck’ code and comes from breaking the idea of Tell Don’t Ask. In essence, we should not be asking an object for its data and then performing operations on that data; we should be telling the object what we want it to do.
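To make the contrast concrete, here is a small sketch with hypothetical Account behaviour – the point is that the withdrawal rule lives inside the object rather than being reconstructed by every caller:

public class Account {
    private double balance;

    public Account(double openingBalance) {
        this.balance = openingBalance;
    }

    // Telling: the behaviour lives with the data it operates on
    public void withdraw(double amount) {
        if (amount > balance) {
            throw new IllegalArgumentException("Insufficient funds");
        }
        balance -= amount;
    }

    public double getBalance() {
        return balance;
    }
}

class TellDontAskExample {
    public static void main(String[] args) {
        Account account = new Account(100.0);

        // Asking (train wreck style) would dig the balance out and mutate it from outside:
        //   if (customer.getAccount().getBalance() >= 40.0) { ... }

        // Telling: one message, and the object enforces its own rules
        account.withdraw(40.0);
        System.out.println(account.getBalance()); // 60.0
    }
}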

Side Effect Free Functions/Command Query Separation

DDD talks about side effect free functions which are described as follows:

An operation that computes and returns a result without observable side effects

The developer calling an operation must understand its implementation and the implementation of all its delegations in order to anticipate the result.

My colleague Kris Kemper talks about a very similar OOP best practice called command query separation. From Martin Fowler’s description:

The really valuable idea in this principle is that it’s extremely handy if you can clearly separate methods that change state from those that don’t. This is because you can use queries in many situations with much more confidence, introducing them anywhere, changing their order.

It’s not exactly the same, but they share an intention – helping the code read more intuitively so that we can understand what it does without having to read all of the implementation details.
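A small sketch of the separation (the class is hypothetical):

import java.util.ArrayList;
import java.util.List;

public class OrderQueue {
    private final List<String> orders = new ArrayList<String>();

    // Command: changes state, returns nothing
    public void add(String order) {
        orders.add(order);
    }

    // Query: returns a result and observably changes nothing,
    // so callers can introduce it anywhere, in any order, with confidence
    public int size() {
        return orders.size();
    }

    // A violation would be a method that removes the head of the queue
    // *and* returns it - it could no longer be called freely as a query
}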

Intention Revealing Interfaces/Meaningful Naming

Intention Revealing Interfaces describe a similar concept to Side Effect Free Functions although they address it slightly differently:

A design in which the names of classes, methods, and other elements convey both the original developer’s purpose in creating them and their value to a client developer.

If a developer must consider the implementation of a component in order to use it, the value of encapsulation is lost.

In OOP this would be described as using meaningful names as detailed in Uncle Bob’s Clean Code (my review).
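As a tiny (hypothetical) sketch of the difference meaningful naming makes:

import java.util.ArrayList;
import java.util.List;

public class CustomerSearch {

    // A name like 'process(names)' would force the reader into the implementation;
    // this name states the purpose and the result without exposing how it is done
    public List<String> namesStartingWith(List<String> names, String prefix) {
        List<String> matches = new ArrayList<String>();
        for (String name : names) {
            if (name.startsWith(prefix)) {
                matches.add(name);
            }
        }
        return matches;
    }
}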

Bounded Context/Clean Boundaries

DDD’s bounded context describes “The delimited applicability of a particular model”, i.e. the context in which it is held valid.

This is quite closely related to the idea of clean boundaries in Clean Code where Uncle Bob states:

Code at the boundaries needs clear separation and tests that define expectations

In both cases we are creating an explicit separation between ‘our code’ and the outside world, so to speak. We want to clearly define where ‘our world’ ends by defining the interfaces through which we interact with the outside world.

Anti Corruption Layer/Wrappers

The anti corruption layer in DDD is “an isolating layer to provide clients with functionality in terms of their own domain model.”

It is used to create a boundary for our bounded context so that the models of other systems we interact with don’t creep into our system.

This is implemented in OO using one of the wrapper patterns. Examples of these are the Facade, Adapter and Gateway patterns, which all solve the problem in slightly different ways.

The intention in all cases is to have one area of our code which calls 3rd party libraries and shields the rest of the code from them.
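A rough sketch of the Gateway flavour – the vendor and domain types here are entirely hypothetical – where only the gateway knows about the third-party model:

// Hypothetical third-party representation we don't want leaking inwards
class VendorCustomerRecord {
    public String fullName;
    public String custRef;
}

// Our domain's view of a customer, expressed in our own terms
class Customer {
    private final String name;
    private final String id;

    Customer(String name, String id) {
        this.name = name;
        this.id = id;
    }
}

// The gateway translates at the boundary; the rest of our code only sees Customer
public class CustomerGateway {
    public Customer toDomain(VendorCustomerRecord record) {
        return new Customer(record.fullName, record.custRef);
    }
}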

Domain Driven Design = Object Oriented Programming + Ubiquitous Language?

While talking through some of these ideas I started to come to the conclusion that maybe the ideas DDD describes are in fact very similar to those that OOP originally set out to describe.

The bit that DDD gives us, which has perhaps been forgotten in OOP over time, is describing the interactions in our systems in terms of the business problem we are trying to solve, i.e. the Ubiquitous Language.

From Wikipedia’s Object Oriented Programming entry:

OOP can be used to translate from real-world phenomena to program elements (and vice versa). OOP was even invented for the purpose of physical modeling in the Simula-67 programming language.

The second idea of physical modeling seems to have got lost somewhere along the way and we often end up with code that describes a problem at a very low level. Instead of describing a business process we describe the technical solution to it. You can be writing OO code and still not have your objects representing the terms that the business uses.

There are some things that DDD has certainly made clearer than OOP has managed. In particular, the first part of the book, which talks about building a business-driven Domain Model, covers something we don’t pay enough attention to when using OOP.

For me personally, before I read about the concepts of DDD I would derive a model that I thought worked and then rarely go back and re-examine it to see if it was actually accurate. Reading DDD has made me aware that this is vital, otherwise you eventually end up translating between what the code says and what the business says.

Ideas around maintaining model integrity are also an area I don’t think would necessarily be covered in OOP although some of the implementations use OOP ideas so they are not that dissimilar.

Why the dismissal of DDD?

The reason I decided to explore the similarities between these two concepts wasn’t to dismiss Domain Driven Design – I think the framework it has given us for describing good software design is very useful.

Clearly I have not mapped every single DDD concept to an equivalent in OOP. I think DDD has given a name or term to some things that we may just take for granted in OOP. Certainly the DDD ideas expressed around the design of our model are all good OOP techniques that may not be explicitly stated anywhere.

I wanted to point out these similarities as I feel it can help to reduce the fear of adopting a new concept if we know it has some things in common with what we already know – if a developer knows how to write OO code and knows design concepts very well then the likelihood is that the leap to DDD will not actually be that great.

It would be really good if we could get to the stage where when we teach the concepts of OOP we can do so in a way that emphasises that the objects we create should be closely linked to the business domain and are not just arbitrary choices made by the developers on the team.

Maybe the greatest thing about DDD is that it has brought all these ideas together in one place and made them more visible to practitioners.

I am very interested in how different things overlap, what we can learn from these intersections and what they have in common. It’s not about the name of the concept for me, but about learning the best way to deliver software and then to maintain that software after it has been delivered.

Written by Mark Needham

September 20th, 2008 at 1:12 pm

Should we always use Domain Model?

with 8 comments

During the discussion about Domain Driven Design at the Alt.NET conference, I felt like the idea of the Rich Domain Model was being presented as the only way to design software, but I don’t feel that this is the case.

As always in software we never have a silver bullet and there are times when Domain Model is not necessarily the best choice, just as there are times when OOP is not necessarily the best choice.

To quote from Martin Fowler’s Patterns of Enterprise Application Architecture:

It all comes down to the complexity of the behaviour in your system. If you have complicated and ever-changing business rules involving validation, calculations, and derivations…you’ll want an object model.

What are the alternatives?

Domain Model is not a silver bullet, and Martin suggests two alternatives for when a model driven approach may not be the best choice:

  1. Transaction Script – The best thing about this is its simplicity. It is easy to understand as all the logic is in one place, and it is a good choice for applications with a small amount of logic (see the sketch after this list).
  2. Table Module – This is a database-driven approach with one class per table. If the system you’re working on takes a very table-oriented approach to storing data then this may be a good choice.
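For reference, a minimal sketch of the Transaction Script style (all the names here are hypothetical): each use case is a single procedure that does its work from top to bottom.

// Transaction Script: one procedure per use case, all the logic in one place
public class MoneyTransferScript {

    private final AccountStore accounts;

    public MoneyTransferScript(AccountStore accounts) {
        this.accounts = accounts;
    }

    public void transfer(String fromId, String toId, double amount) {
        double fromBalance = accounts.balanceOf(fromId);
        if (fromBalance < amount) {
            throw new IllegalArgumentException("Insufficient funds");
        }
        accounts.setBalance(fromId, fromBalance - amount);
        accounts.setBalance(toId, accounts.balanceOf(toId) + amount);
    }
}

// Hypothetical data access helper, just to keep the sketch self-contained
interface AccountStore {
    double balanceOf(String accountId);
    void setBalance(String accountId, double balance);
}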

I think in order to make a Domain Model approach work, everyone in the team (including QAs, BAs etc.) needs to buy into the idea, and you need some people who have experience using it so that you can apply it in a pragmatic way.

While we have some great tools and techniques available to us in the world of software it is important to remember what problem we are trying to solve and pick the appropriate tool for the job.

*Updated*
I’ve edited the phrasing of this after conversation – I intended to refer to the Rich Domain Model concept used in Domain Driven Design and was presenting alternatives to this rather than to DDD as a whole.

Written by Mark Needham

September 19th, 2008 at 8:34 am

Posted in Domain Driven Design
