Mark Needham

Thoughts on Software Development

Book Club: Versioning your database (K. Scott Allen)

with 5 comments

In our latest technical book club we discussed a series of posts written by K. Scott Allen about getting your database under version control.

These are some of my thoughts from our discussion:

  • We had an interesting discussion around when it’s OK to change checked-in change scripts. On previous projects I’ve worked on we had the rule that once a change script was checked in to source control it could no longer be changed – instead you needed to add another change script that made the change you wanted.

    Several colleagues pointed out that this approach can lead to a lot of fine-grained change scripts – each perhaps adding or altering a single column – and that an approach they had used with some success was to allow a script to be changed up until it had been applied in an environment outside the team’s control, typically UAT.
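The append-only convention can be sketched as a minimal migration runner – the script names and the use of sqlite3 here are illustrative assumptions, not any particular tool we used:

```python
import sqlite3

# Hypothetical append-only change scripts: once checked in, a script is
# never edited. A new requirement means a new script, appended to the list.
CHANGE_SCRIPTS = [
    ("001_create_customer.sql",
     "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"),
    ("002_add_customer_email.sql",
     "ALTER TABLE customer ADD COLUMN email TEXT"),
]

def apply_change_scripts(conn):
    # Track which scripts have already run so re-running is safe.
    conn.execute("CREATE TABLE IF NOT EXISTS schema_changes (name TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT name FROM schema_changes")}
    for name, sql in CHANGE_SCRIPTS:
        if name not in applied:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_changes (name) VALUES (?)", (name,))
    conn.commit()

conn = sqlite3.connect(":memory:")
apply_change_scripts(conn)
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
```

Because every environment runs the same ordered list and records what it has applied, the runner can be executed repeatedly without re-applying old scripts.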

  • On a previous project I worked on where we didn’t script the database, we used Hibernate annotations to specify the mappings between domain objects and the database, so the SQL for database creation was auto-generated by a Hibernate tool.

    This doesn’t seem to fit in with the idea of allowing us to incrementally change our database but Tom pointed out that it might be quite a useful way to get an initial baseline for a database. After that we probably want to make use of change scripts.

  • We discussed the coupling between the model of a system that we have in the database and the one that exists in our code when using ORM tools. We often end up with these two models being quite tightly coupled, which means that changing our domain model can become quite difficult.

    Tom pointed out that the database is just another data source but that it often seems to be treated differently to other data sources in this respect.

    He also suggested creating an anti-corruption layer in our code between the database and our domain model if the two start to diverge too much, rather than trying to keep them as similar as possible through imaginative use of ORM mappings.
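A minimal sketch of what such an anti-corruption layer might look like – the legacy column names and the `Customer` class are hypothetical, invented for illustration:

```python
from dataclasses import dataclass

# Hypothetical legacy row shape, exactly as the database hands it to us.
legacy_row = {"CUST_NM": "Jane", "CUST_EMAIL_ADDR": "jane@example.com",
              "ACTV_FLG": "Y"}

@dataclass
class Customer:
    name: str
    email: str
    active: bool

def to_domain(row):
    # The anti-corruption layer: all knowledge of the legacy naming and
    # encodings (the "Y"/"N" flag) lives here, so the domain model stays
    # clean and can evolve independently of the schema.
    return Customer(name=row["CUST_NM"],
                    email=row["CUST_EMAIL_ADDR"],
                    active=row["ACTV_FLG"] == "Y")

customer = to_domain(legacy_row)
```

The translation function is the only place that has to change when the schema and the domain model drift apart.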

  • Another interesting area of discussion was around how to deal with test data and existing data in the system with respect to our change scripts.

    On projects I’ve worked on, if we had reference data that was required across all environments and we wanted to change that data, then we would just use another change script to do so.

    Ahrum suggested that if we had environment specific test data then we’d probably have other environment specific scripts that we could run after the change scripts had been executed.
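That layering might look something like this – the script names and environments are made up for illustration:

```python
# Shared change scripts run everywhere, in order; environment-specific
# data scripts (hypothetical names) run afterwards.
CHANGE_SCRIPTS = ["001_create_tables.sql", "002_add_status_column.sql"]
ENV_DATA_SCRIPTS = {
    "dev": ["dev_test_users.sql"],
    "uat": ["uat_test_users.sql"],
    "prod": [],  # production gets no test data
}

def scripts_for(environment):
    # Schema changes always come first so the data scripts can rely on
    # the tables and columns they populate already existing.
    return CHANGE_SCRIPTS + ENV_DATA_SCRIPTS[environment]

dev_scripts = scripts_for("dev")
```

Keeping the two lists separate means the schema stays identical everywhere while the data layered on top can differ per environment.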

    One of the more interesting problems when making changes to tables which already contain data is working out what we want to do with the existing data if we change the type of a column.

    We can typically either create a temporary column, copy all the values into it, and then write an update script that converts each value to the new column’s data type, or we can simply discard the data – a less acceptable option from what I’ve seen.

    The effort involved in doing this often means that we are more reluctant to change column types after their initial creation, which means that we need to choose the type quite carefully early on.
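The temporary-column approach can be sketched against sqlite3 – the table and column names are hypothetical, and the final drop-and-rename step varies by database, so it is only noted in a comment here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount TEXT)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)",
                 [("10.50",), ("3.99",)])

# Step 1: add a temporary column of the new type.
conn.execute("ALTER TABLE orders ADD COLUMN amount_real REAL")
# Step 2: convert and copy the existing values across.
conn.execute("UPDATE orders SET amount_real = CAST(amount AS REAL)")
# Step 3 (database-dependent): drop the old TEXT column and rename the
# new one into its place, once the converted values have been verified.
total = conn.execute("SELECT SUM(amount_real) FROM orders").fetchone()[0]
```

No data is lost: the old column stays in place until the converted values have been checked.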

  • A couple of colleagues pointed out that a benefit they had found with this approach is that it helps to expose problems in the process – there can be no under-the-table changes to databases if they are only ever manipulated by change scripts, otherwise we end up with different setups in different environments.

    Since we will be recreating and updating databases quite frequently these problems will become obvious quite quickly.

  • Dean pointed out that the change script approach is actually really useful for performance testing – a benefit of this approach that I hadn’t previously considered.

    When doing this type of testing we know which version of the database we were using at the time, so if a performance problem suddenly appears it should be much easier to track down which change caused it.


Written by Mark Needham

September 24th, 2009 at 7:35 am

Posted in Book Club


  • http://www.tewari.info ash

    A free tool to implement K. Scott Allen’s approach can be downloaded from here : http://www.tewari.info/dbupdater

  • Wes McClure

    Great blog post, I like the mentions of flushing out problems early. We verify our change scripts in our CI environment, so I can make a deploy to UAT at any point a client requests it.

    We also automatically build an “alpha” environment between dev and UAT that can be tested by managers internally so they don’t have to check out and build code and/or update their database to review functionality.

    The best part though, is the automated deploy of change scripts to production. Now I have an installer for code and an update batch file/PowerShell script to update the production database! Manual deploys only take a few minutes to perform, and could eventually be scheduled to be automated. All of this thanks to change scripts!

  • Scott

    I’m glad my posts could spark an interesting discussion.
    I only wish I could have been there to participate!

    I still go with the rule that once a change script is checked in, it is published and can never change (unless there was something really, really wrong). Otherwise it creates confusion (do I really have the latest schema?).

    One thing we’ve tried to do during those times when the schema is still in flux is try to batch things up by nominating one person as the person to collect changes, verify them, and check in a change script. It’s not a great job to have, but these periods usually don’t last very long and it avoids the problem of too many fine grained change scripts.

    My other $0.02 :)

  • Simon Brunning

    See DbDeploy for another tool to help implement this pattern. I’ve used it on a couple of projects – works a treat.

  • http://www.markhneedham.com Mark Needham

    @Simon – yeah we used DbDeploy.NET on a project I worked on previously and that seemed to do the job quite nicely.

    @Scott – I like that approach as well. I guess that role you describe could work out quite nicely for countering the problem of fine grained changes too. Neat.