Mark Needham

Thoughts on Software Development

Micro Services: The curse of code ‘duplication’

with 8 comments

A common approach we’ve been taking on some of the applications I’ve worked on recently is to decompose the system we’re building into smaller micro services which are independently deployable and communicate with each other over HTTP.

An advantage of decomposing systems like that is that we could have separate teams working on each service and then make use of a consumer driven contract as a way of ensuring the contract between them is correct.
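As a rough illustration of the consumer driven contract idea, the sketch below has each consumer declare the fields it relies on, and the provider check a sample response against that declaration. The `ConsumerContract` class, its method names and the field names are all hypothetical, and the response is modelled as an already-parsed `Map` so that no particular JSON library is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical contract: the fields one consumer relies on in a response.
final class ConsumerContract {
    private final String consumerName;
    private final Set<String> requiredFields;

    ConsumerContract(String consumerName, Set<String> requiredFields) {
        this.consumerName = consumerName;
        // Sorted copy so that reported failures come out in a stable order.
        this.requiredFields = new TreeSet<>(requiredFields);
    }

    String consumerName() {
        return consumerName;
    }

    // Lists the required fields that a provider response fails to supply.
    // Extra fields are ignored: the consumer has not asked for them.
    List<String> missingFrom(Map<String, Object> providerResponse) {
        List<String> missing = new ArrayList<>();
        for (String field : requiredFields) {
            if (!providerResponse.containsKey(field)) {
                missing.add(field);
            }
        }
        return missing;
    }
}
```

The provider's build could run `missingFrom` against a sample response for every consumer's contract and fail when the returned list is not empty, which is one way of noticing a breaking change before it is deployed.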

Often what actually happens is that we have one team working on all of the services, which can lead to a mentality where we start treating the communication between services as if it’s happening in process.

One of the earliest lessons I learnt when writing code is that you should avoid repeating yourself in code – if you have two identical bits of code then look to extract that into a method somewhere and then call it from both locations.

This lesson often ends up getting applied across micro service boundaries when we have the same team working on both sides.

For example, if we have a customer that we’re sending between two services then in Java land we might create a CustomerDTO in both services to marshal JSON to/from the wire.

We now have two versions of the ‘same’ object, although they aren’t necessarily the same: the client might not care about some of the fields that get sent because its definition of a customer differs from the provider’s.
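A minimal sketch of what the two definitions of a customer might look like is shown below. The class names, field names and the `toWire`/`fromWire` methods are assumptions made for illustration, and a plain `Map` stands in for the JSON payload so that the example does not depend on any particular JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Provider side: its idea of a customer carries every field it owns.
final class ProviderCustomerDTO {
    final String id;
    final String name;
    final String email;
    final String billingAddress;

    ProviderCustomerDTO(String id, String name, String email, String billingAddress) {
        this.id = id;
        this.name = name;
        this.email = email;
        this.billingAddress = billingAddress;
    }

    // Stand-in for JSON serialisation: the keys are what goes over the wire.
    Map<String, Object> toWire() {
        Map<String, Object> wire = new LinkedHashMap<>();
        wire.put("id", id);
        wire.put("name", name);
        wire.put("email", email);
        wire.put("billingAddress", billingAddress);
        return wire;
    }
}

// Consumer side: a separate class that reads only the fields this service needs.
final class ConsumerCustomerDTO {
    final String id;
    final String name;

    private ConsumerCustomerDTO(String id, String name) {
        this.id = id;
        this.name = name;
    }

    // Fields the consumer does not know about are simply never looked up.
    static ConsumerCustomerDTO fromWire(Map<String, Object> wire) {
        return new ConsumerCustomerDTO((String) wire.get("id"), (String) wire.get("name"));
    }
}
```

In this sketch the provider can add or rename fields such as `billingAddress` without the consumer's class needing to change, which is not the case once both sides compile against one shared class.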

Nevertheless, if we’re used to working with tools like IntelliJ, which let us make a change and see it reflected everywhere, we’ll end up driving towards a design where the CustomerDTO is shared between the services.

This can be done via a JAR dependency or using something like git submodules, but in both cases we’ve now coupled the two services on their shared message format.

I think the ‘duplication’ thing might be less of an issue if you’re using a language like Clojure where you could work with maps/lists without transforming them into objects but I haven’t built anything web related with Clojure so I’m not sure.

As I understand it when we go down the micro services route we’re trading off the ease of working with everything in one process/solution for the benefits of being able to deploy, scale and maintain parts of it independently.

Perhaps the problem I’ve described is just about getting used to this trade off rather than holding onto the idea that we can still treat it as a single application.

I’d be curious to hear others’ experiences but I’ve seen this same pattern happen two or three times so I imagine it may well be common.

Written by Mark Needham

November 28th, 2012 at 8:11 am

Posted in Micro Services

  • http://twitter.com/ctford Chris Ford

    Be veeeery careful here. If the shared code becomes an implicit part of the contract you can’t upgrade one half of the service without the other and things get very sticky.

  • Anonymous

    I don’t think you necessarily have to regard it as a tradeoff. I think you need to understand and treat duplication in a different way when it is separated by two application boundaries.

    Duplication can be a smell I agree but that doesn’t mean you need to solve duplication in the same way as you would within the boundary of a single application – that is where you extract common code. 

    I think the places where you see such duplication are going to tell you a lot about your architecture and the way things fit together and could even identify further coupling problems down the line – your need to remove duplication by extracting common code could suggest that the client+server share too much, or multiple clients are consuming the same fields on your object, the server is too chatty, the client is too demanding. Or it might not indicate a problem at all; it might just be an area you watch later on to make sure no problems develop!

    To address your specific issue: perhaps rather than have a common DTO class with everything on (agree with Chris re. basically gluing your apps together at the service boundary by doing this) why not find frameworks/tools to make creating individual DTOs in each of your services easier? In short, write some shit hot JSON parsing utilities that reduce the amount of grunt work in your DTOs and share that! Of course, Clojure or any dynamic language is going to make that sort of thing easier in some ways, but it’s nothing that could not be done with a bit of thought in a more type-pernickety language.

  • David Turner

    In counterpoint to Chris Ford’s comment, I think it’s quite painful trying to keep two basically-identical DTO layers in sync. The tests that we ended up writing to support this case felt quite silly when you’ve no good reason to have separated the DTOs.

    I’m not sure it’s a bad thing to couple a number of services to a library for their shared message format – communicating services must agree on their message formats so there’s some kind of coupling going on there anyway even if it’s not expressed in the source code. Often the coupling is given in some prose documentation describing the contract, particularly when it’s for third-party consumption, which also has to be maintained alongside the DTO code.

    The library is also where we put code related to versioning the message formats, which makes upgrades pretty straightforward.

    We see sharing DTO code as the lower-risk option, as decoupling the two sides of the contract is a relatively easy change that can wait until it needs to be done. It gets harder if you let any service-specific logic creep into the DTOs before they split, so you have to put a bit of effort into stopping that from happening.

    This may be an instance of Conway’s Law: “organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations” from http://en.wikipedia.org/wiki/Conway%27s_law – this predicts that if you have a tightly-coupled team of developers working on a bunch of microservices then you will likely end up with a tightly-coupled bunch of microservices.

  • http://www.markhneedham.com/blog Mark Needham

    Yes that’s exactly what we have effectively! It’s not the greatest solution.
    From what I can tell the best way to design systems like this is to actually not share anything in this way and just accept some duplication.

  • http://twitter.com/martinfowler Martin Fowler

    There’s no need to share DTOs on both sides of the service; indeed I’d argue that you usually would not want to do that. If the services are supposed to be separate code bases then you should have separate DTOs. (This is different to the case where you are building a single code base with comms between nodes; then shared DTOs are ok.)

    To make it easier to build DTOs, a useful approach is to use a hash as the internal storage for the DTO and then use a marshalling class to marshal the data into the data transport format. That marshalling class is then a common utility (and would likely be shared). This way all your DTO does is map from an explicit interface (methods) to the hash. This gives you the advantages of a map/list representation (such as Clojure offers) but also the benefits of an explicit interface.

    The key thing about a DTO is that it encapsulates the data transport mechanism, rather than the data structure. Although in this case it also hides the decision to use hash/list structures internally rather than standard language fields.

  • Ilias Bartolini

    I had the same headache when working with James.
    I think the choice is based on the reason “why” we use microservices.

    If the main reason is to “scale” and there’s only one team and one group that decides when a new version of a given service should be deployed, I think that sharing code could be a good idea that reduces maintenance cost.

    If some of those conditions don’t hold (different teams working independently, services managed by different organisations, …) I would prefer having separate DTOs and less shared code.

    Ilias

  • http://www.markhneedham.com/blog Mark Needham

    @Chris Ford I tried to reply to you from my phone a couple of days ago but it clearly didn’t work!

    Anyway, what I said was that I agree with you: it does become implicit, and you pretty much need all the services that rely on the common JAR/git submodule to point to the same version of it for stuff to work.

    If there were separate teams we’d be much less likely to think sharing code was a good solution, but when it’s within a team it does seem more attractive.

    My general observation of the micro service approach is that you probably see more of its benefits over a long period of time, but you pay a bit of a cost at the beginning to get them.

  • http://www.markhneedham.com/blog Mark Needham

    @Ilias Bartolini I like the idea of thinking about why we are doing this. And yes, working with James was my first experience of this, but it seems common!

    In this context the team is actually building a 5/6 week prototype to show this way of building a system, so the overhead of having to duplicate everything doesn’t seem so appealing because they have to move very quickly. Maybe if the system were being built over, and for, a longer period of time it would be less of an issue.
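
Martin Fowler's suggestion above, a DTO that stores its data in a hash and leaves the transport format to a separate marshalling class, might look roughly like the sketch below. The names `CustomerDTO`, `FlatJsonMarshaller`, `getName` and `getEmail` are assumptions for illustration. The marshaller only handles flat maps of strings, numbers, booleans and nulls and does not escape control characters, so a real service would more likely delegate that part to a JSON library.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Service-specific DTO: its only state is a map, its methods are the explicit interface.
final class CustomerDTO {
    private final Map<String, Object> data;

    CustomerDTO(Map<String, Object> data) {
        // Defensive copy that keeps the original key order.
        this.data = new LinkedHashMap<>(data);
    }

    String getName() {
        return (String) data.get("name");
    }

    String getEmail() {
        return (String) data.get("email");
    }

    // Read-only view handed to the marshaller.
    Map<String, Object> asMap() {
        return Collections.unmodifiableMap(data);
    }
}

// Shared utility: knows the transport format but nothing about customers.
final class FlatJsonMarshaller {
    private FlatJsonMarshaller() {
    }

    static String marshal(Map<String, Object> data) {
        StringBuilder json = new StringBuilder("{");
        String separator = "";
        for (Map.Entry<String, Object> entry : data.entrySet()) {
            json.append(separator).append(quote(entry.getKey())).append(':');
            Object value = entry.getValue();
            if (value instanceof String) {
                json.append(quote((String) value));
            } else {
                // Numbers, booleans and null are appended as their literal text.
                json.append(value);
            }
            separator = ",";
        }
        return json.append('}').toString();
    }

    // Wraps text in double quotes, escaping backslashes and embedded quotes.
    private static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

With this split, each service keeps its own small `CustomerDTO` exposing only the accessors it cares about, while the marshaller is the piece that could reasonably live in a shared library.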