Mark Needham

Thoughts on Software Development

Coding: The primitive obsession

with 5 comments

I recently came across an interesting post by Naresh Jain where he details a discussion at the SDTConf 2009 about the code smells that hurt people the most.

Naresh describes the ‘primitive obsession’ anti pattern as being the crux of poor design:

I would argue that I’ve seen code which does not have much duplication but its very difficult to understand what’s going on. Hence I claim, “only if the code had better abstractions it would be a lot easier to understand and evolve the code”. Also when you try to eliminate duplicate code, at one level, there is no literal code duplication, but there is conceptual duplication and creating a high order abstraction is an effective way to solve the problem. Hence I conclude that looking back, Primitive Obsession is at the crux of poor design. a.k.a Biggest Stinker.

When I first read this post I thought Naresh was only talking about primitives such as strings, integers, decimals and so on but after discussing this pattern at today’s book club Tom pointed out that the problem might also be defined as ‘describing high level domain concepts with low level types‘.

I’ve noticed that this pattern in code often seems to happen when we have abstractions in our code which are very generic and don’t express the domain very clearly.

I think the reason people code in this way comes from a desire to minimise the amount of code being written – as I understand it the theory is that if we write less code then it will be more maintainable.

I really like the way Dave Cameron describes the disadvantages of doing this in a comment on Naresh’s post:

Naresh, I agree strongly with your assertion that we need *better* abstractions. Eliminating duplication by introduction a mediocre abstraction leads to pain. Removing duplication and parceling it up in an awkward way can make it more difficult to see a more elegant abstraction later. The short-term benefit of eliminating duplication leads to a long-term pain of awkward design.

The other motivator behind this style of coding is to try and create classes which are really reusable.

Udi Dahan explains why this often doesn’t work but the problem that I see from this attempt at writing reusable code is that it will get reused in places where the concept doesn’t quite make sense.

As a result of this we end up writing a lot of very hacky code to try and make it fit.

The most frequent offender seems to be the creation of collections of things which we then query to find specific items elsewhere in the code. More often that not we’ll also perform some custom logic on that particular item.

At its most extreme we might extract each of the items in a collection and make use of them separately. This somewhat defeats the purpose of the collection.

When this happens it often seems to me that what we actually have is a series of attributes of an object rather than a collection.

While discussing why this type of coding happens it was pointed out that it’s much easier to make use of the primitives than to spend the time looking for a helpful abstraction – I think this is an example of bad laziness.

We might be spending less effort now writing the code but it will be more difficult for anyone else who reads it in future which is not a good trade off.

Another similar pattern I’ve seen quite frequently is where we extend one of the classes in the collection API with our ‘domain concept‘.

While this seems better than just passing around a collection the problem is that any clients of our class have access to any of the methods of the collections API we extend which provides more flexibility than we would like and can lead to our object being in an unexpected state if those methods get used.

In summary

I think the way to solve this problem comes down to spending more time thinking about our domain and the best way to model it and discussing this regularly with other people on the team.

That way I believe our knowledge of how the system actually works will be more explicit in the code which will make it easier for people to understand the code base.

Be Sociable, Share!

Written by Mark Needham

October 23rd, 2009 at 12:08 am

Posted in Coding

Tagged with

  • Pingback: Tweets that mention Coding: The primitive obsession at Mark Needham -- Topsy.com()

  • I think it will probably even be more difficult for the original authour to read it in the future. They will need to remember there own understanding of the problem domain. A few months or years later, that will likely not be so easy.

  • This is why getting checkstyle to fail a build due to duplicate lines of code is a bad thing. It forces you to create the mediocre abstraction for the sake of making the build pass. It would be better to carry the tech debt and apply a more elegant solution when a pattern emerges for reuse.

  • ‘I think the way to solve this problem comes down to spending more time thinking about our domain and the best way to model it and discussing this regularly with other people on the team.’

    For me this is the biggy!

    If people spent more time thinking about their domain model then their understanding of their abstractions would be better and therefore reducing primitive obsession – I also see people getting scared of reviewing their abstractions incase they are not suitable, another instance of non agile behaviour, we’re all guilty of this 😉

    As for collections – like all state in a domain I believe it should be immutable, if you want to affect a change talk to the domain entity containing the collection…

  • Pingback: Wrapping about Collections()