Archive for the ‘Version Control’ Category
A couple of days ago I was curious when different versions of Neo4j had been released and although the release notes page was helpful I thought I’d find more detailed information if I looked up the git tags.
Assuming that we’ve already got a clone of the repository on our machine:
$ git clone git@github.com:neo4j/neo4j.git
We can pull down the latest tags by calling git fetch --tags or git fetch -t:
$ git fetch -t
remote: Counting objects: 542, done.
remote: Compressing objects: 100% (231/231), done.
remote: Total 287 (delta 247), reused 84 (delta 50)
Receiving objects: 100% (287/287), 42.85 KiB, done.
Resolving deltas: 100% (247/247), completed with 191 local objects.
From github.com:neo4j/neo4j
 * [new tag]         1.9.2      -> 1.9.2
 * [new tag]         1.9.5      -> 1.9.5
 * [new tag]         2.0.0-M06  -> 2.0.0-M06
We can get a list of all the tags in the repository with the following command:
$ git tag | head -n5
1.3
1.4
1.4.1
1.4.2
1.4.M01
Now let’s have a look at which commit the 1.3 tag points at:
$ git show 1.3
tag 1.3
Tagger: Neo4j Build Server <email@example.com>
Date:   Tue Nov 20 17:03:38 2012 +0000

Tagging for release 1.3

commit ff16757dd53399eccb8f3db40eb48bab065459b0
Author: Neo Technology buildbox <firstname.lastname@example.org>
Date:   Tue Apr 12 22:03:33 2011 +0000
That command gets us the appropriate information, but ideally we want the commit hash and the date on a single line, which we can get by passing the --format flag to git log:
$ git log --format="%h %ad%n" 1.3
ff16757 Tue Apr 12 22:03:33 2011 +0000
9651aa8 Tue Apr 12 21:58:58 2011 +0000
21c637d Tue Apr 12 12:39:49 2011 +0200
4ed65eb Tue Apr 12 12:39:28 2011 +0200
We can pipe that to head to get the most recent commit:
$ git log --format="%h %ad%n" 1.3 | head -n1
ff16757 Tue Apr 12 22:03:33 2011 +0000
I tried to pipe the output of git tag to git log using xargs but I couldn’t get it to work so I resorted to a for loop instead:
$ for tag in `git tag`; do printf "%-20s %-100s \n" $tag "`git log --format="%h %ad%n" $tag | head -n1`"; done | head -n5
1.3                  ff16757 Tue Apr 12 22:03:33 2011 +0000
1.4                  5c19dc3 Fri Jul 8 16:22:37 2011 +0200
1.4.1                55f4ab2 Tue Aug 2 15:14:11 2011 +0300
1.4.2                cb85742 Tue Sep 27 18:59:13 2011 +0100
1.4.M01              f5aacf4 Fri Apr 29 10:12:52 2011 +0200
We could then pipe that output through grep to show only the non-point releases:
$ for tag in `git tag`; do printf "%-20s %-100s \n" $tag "`git log --format="%h %ad%n" $tag | head -n1`"; done | grep -E "^[0-9]\.[0-9] "
1.3                  ff16757 Tue Apr 12 22:03:33 2011 +0000
1.4                  5c19dc3 Fri Jul 8 16:22:37 2011 +0200
1.5                  0225cb7 Thu Oct 20 03:51:06 2011 +0200
1.6                  f6f3cc1 Sun Jan 22 15:02:04 2012 +0100
1.7                  cc4ad98 Wed Apr 18 18:32:20 2012 +0200
1.8                  084acc9 Tue Sep 25 09:47:04 2012 +0100
1.9                  2efc04c Mon May 20 12:08:24 2013 +0100
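The xargs version of the loop above can probably be made to work with the -I flag. The following is only a sketch of one way to do it: list_tag_commits is a made-up helper name, git log -1 stands in for piping to head -n1, and --date=short is an assumption added purely to keep the lines narrow.

```shell
#!/bin/sh
# Hypothetical helper: print "<tag> <short hash> <author date>" for every tag
# in the repository at path $1.
list_tag_commits() {
    # xargs -I{} runs one "git log" per tag, substituting the tag name for {}.
    # "-1" limits the log to the newest commit reachable from the tag, and the
    # trailing "--" stops a tag name from being mistaken for a path.
    git -C "$1" tag |
        xargs -I{} git -C "$1" log -1 --date=short --format='{} %h %ad' {} --
}
```

Something like list_tag_commits . | grep -E '^[0-9]\.[0-9] ' would then filter down to the non-point releases. A tag name containing a % character would confuse the format string, so this is best treated as a quick script rather than anything robust.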
I first played around with Neo4j in September 2011 and I now know that I was using version 1.4 at the time.
We’re now at 1.9.5 and the latest beta release is 2.0.0-M06 so there have been quite a few releases in between!
A few days ago I wrote a blog post describing how I wanted to squash a series of commits into one bigger one before making a pull request and in the comments Rob Hunter showed me an even easier way to do so.
To recap, by the end of the post I had the following git config:
$ cat .git/config
[remote "origin"]
    fetch = +refs/heads/*:refs/remotes/origin/*
    url = git@github.com:mneedham/neo4j-shell-tools.git
[branch "master"]
    remote = origin
    merge = refs/heads/master
[remote "base"]
    url = git@github.com:jexp/neo4j-shell-tools.git
    fetch = +refs/heads/*:refs/remotes/base/*
[branch "readme-pull"]
    remote = origin
    merge = refs/heads/readme-pull
[branch "readme"]
    remote = origin
    merge = refs/heads/readme
I was working against the remote ‘origin’ but the actual home of this repository is ‘base’.
I’d created a load of commits on ‘origin/readme’ and had then squashed them all into one commit on ‘origin/readme-pull’ by using the following command:
$ git rebase -i c4e94f668223d53f6c7364d19aa965d09ea7eb00
where ‘c4e94f668223d53f6c7364d19aa965d09ea7eb00’ is the hash of the last commit that was made in ‘base/master’.
Rob suggested that I should try using upstream tracking to simplify this even further. When we use upstream tracking we create a link between a local branch and a remote branch, which in this case is useful for working out where our commits start from.
I thought I’d try it out on another branch. We want to set the new branch to track ‘base/master’ since that’s the one we eventually want to have our commit applied against.
We’ll start from the ‘readme’ branch, which has the list of commits that we want to squash:
$ git branch
  master
* readme
  readme-pull
Now let’s create a new branch and then track it against ‘base/master’:
$ git checkout -b readme-pull-new
Switched to a new branch 'readme-pull-new'
$ git branch --set-upstream readme-pull-new base/master
Branch readme-pull-new set up to track remote branch master from base.
Squashing all our commits is now as simple as running the following command:
$ git rebase -i
We then choose ‘squash’ for every commit except the first one, which can stay as ‘pick’. After that we need to edit the commit message into shape, which in this instance mostly involves deleting the messages of the commits we’ve squashed.
Thanks to Rob for the tip!
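With an upstream in place, the same squash could also be done without the interactive editor. This is a sketch of a different technique (a soft reset back to the merge base) rather than what Rob suggested; squash_onto_upstream is a made-up helper name, and it assumes the current branch already has its upstream set. Newer versions of git spell that step git branch --set-upstream-to=base/master, since the older --set-upstream flag has been deprecated.

```shell
#!/bin/sh
# Hypothetical helper: collapse every commit made since the current branch
# diverged from its upstream into a single commit with message $1.
squash_onto_upstream() {
    # The merge base is the last commit shared with the upstream branch.
    base=$(git merge-base HEAD '@{upstream}') || return 1
    # A soft reset moves the branch back to that commit but keeps all the
    # changes staged, so one commit then records the lot.
    git reset -q --soft "$base" && git commit -q -m "$1"
}
```

Calling squash_onto_upstream "readme updates" on the new branch should leave a single commit on top of the point where it left ‘base/master’. Unlike the interactive rebase it throws away the individual commit messages rather than offering them for editing.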
My colleague Michael has been doing some work to make it easier for people to import data into neo4j and his latest attempt is neo4j-shell-tools which adds some additional commands to the neo4j-shell.
I wanted to send Michael a pull request on GitHub but first I needed to squash all my commits down into a single one.
I initially thought there might be a way to do that via GitHub, but I couldn’t see how and eventually came across a post on Steve Klabnik’s blog which explained what I needed to do.
This is what my .git/config looked like initially:
[remote "origin"]
    fetch = +refs/heads/*:refs/remotes/origin/*
    url = git@github.com:mneedham/neo4j-shell-tools.git
[branch "master"]
    remote = origin
    merge = refs/heads/master
[remote "base"]
    url = git@github.com:jexp/neo4j-shell-tools.git
    fetch = +refs/heads/*:refs/remotes/base/*
[branch "readme"]
    remote = origin
    merge = refs/heads/readme
I had all my commits on the ‘readme’ branch but the easiest approach seemed to be to create another branch on which I could squash all my commits. I called that branch ‘readme-pull’:
$ git branch readme-pull
$ git checkout readme-pull
Switched to branch 'readme-pull'
I then synced myself with Michael’s repository:
$ git fetch base
remote: Counting objects: 77, done.
remote: Compressing objects: 100% (18/18), done.
remote: Total 43 (delta 15), reused 40 (delta 12)
Unpacking objects: 100% (43/43), done.
From github.com:jexp/neo4j-shell-tools
   e81c431..c4e94f6  master     -> base/master
$ git rebase base/master
First, rewinding head to replay your work on top of it...
I then had to handle any conflicts when applying my changes on top of Michael’s master branch and then I was in a position to squash all my commits!
We can use rebase in interactive mode to do this and I’ve always done so by counting back how many commits I want to squash, so in this case it was 35:
$ git rebase -i HEAD~35
pick 141d0ae updating readme with link
pick 94f8f93 more updating
pick 03de50b readme updates
pick 4e60332 more updates
pick 3447d50 simplifying
pick d577520 tweaks
pick 2d993d4 more
pick f948582 list of commands
pick 713aae8 updating
I later realised that I could have just passed the hash of the last commit on ‘base/master’ to the rebase command, i.e.
commit c4e94f668223d53f6c7364d19aa965d09ea7eb00
Author: Michael Hunger <email@example.com>
Date:   Fri Jul 12 10:33:55 2013 +0200

    fixed test
$ git rebase -i c4e94f668223d53f6c7364d19aa965d09ea7eb00
I then set all but the first commit to ‘squash’ and pushed to my repository:
$ git push -u origin readme-pull:readme-pull
Finally I issued my pull request and Michael merged it in!
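Rather than counting back by hand or copying a hash, git can be asked how many commits the branch has on top of Michael’s master. A rough sketch, assuming the branch has already been rebased onto ‘base/master’ as above; commits_ahead_of is a name invented here for illustration:

```shell
#!/bin/sh
# Hypothetical helper: count the commits on the current branch that are not
# reachable from the branch (or other revision) given as $1.
commits_ahead_of() {
    git rev-list --count "$1..HEAD"
}
```

git rebase -i "HEAD~$(commits_ahead_of base/master)" should then open the same list of commits, and as far as I can tell git rebase -i base/master does the same thing in one step at that point.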
Andres and I recently found ourselves wanting to delete a remote branch which had the same name as a tag, and therefore the normal way of doing that didn’t work out as well as we’d hoped.
I created a dummy repository to recreate the state we’d got ourselves into:
$ echo "mark" > README
$ git commit -am "readme"
$ echo "for the branch" >> README
$ git commit -am "for the branch"
$ git checkout -b same
Switched to a new branch 'same'
$ git push origin same
Counting objects: 5, done.
Writing objects: 100% (3/3), 263 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To ssh://firstname.lastname@example.org/markhneedham/branch-tag-test.git
 * [new branch]      same -> same
$ git checkout master
$ echo "for the tag" >> README
$ git commit -am "for the tag"
$ git tag same
$ git push origin refs/tags/same
Counting objects: 5, done.
Writing objects: 100% (3/3), 266 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To ssh://email@example.com/markhneedham/branch-tag-test.git
 * [new tag]         same -> same
We wanted to delete the remote ‘same’ branch and the following command would work if we hadn’t created a tag with the same name. Instead it throws an error:
$ git push origin :same
error: dst refspec same matches more than one.
error: failed to push some refs to 'ssh://firstname.lastname@example.org/markhneedham/branch-tag-test.git'
We learnt that what we needed to do was refer to the full path for the branch when trying to delete it remotely:
$ git push origin :refs/heads/same
To ssh://email@example.com/markhneedham/branch-tag-test.git
 - [deleted]         same
To delete the tag we could do the same thing:
$ git push origin :refs/tags/same
remote: warning: Deleting a non-existent ref.
To ssh://firstname.lastname@example.org/markhneedham/branch-tag-test.git
 - [deleted]         same
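The two deletions above could be wrapped up as a pair of small helpers. This is just a sketch: the function names are made up, and it uses git push --delete, which the git documentation describes as the same as prefixing the ref with a colon.

```shell
#!/bin/sh
# Hypothetical helpers: delete a branch or a tag named $2 from remote $1,
# spelling out the full ref so that a name shared by both is not ambiguous.
delete_remote_branch() {
    git push -q "$1" --delete "refs/heads/$2"
}

delete_remote_tag() {
    git push -q "$1" --delete "refs/tags/$2"
}
```

delete_remote_branch origin same would then remove only the branch and leave the tag of the same name alone.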
Of course the tag and branch still exist locally:
$ ls -alh .git/refs/heads/
total 16
drwxr-xr-x  4 markhneedham  wheel   136B 13 Jun 23:09 .
drwxr-xr-x  5 markhneedham  wheel   170B 13 Jun 22:39 ..
-rw-r--r--  1 markhneedham  wheel    41B 13 Jun 23:08 master
-rw-r--r--  1 markhneedham  wheel    41B 13 Jun 23:08 same
$ ls -alh .git/refs/tags/
total 8
drwxr-xr-x  3 markhneedham  wheel   102B 13 Jun 23:08 .
drwxr-xr-x  5 markhneedham  wheel   170B 13 Jun 22:39 ..
-rw-r--r--  1 markhneedham  wheel    41B 13 Jun 23:08 same
So we got rid of them as well:
$ git checkout master
Switched to branch 'master'
$ git branch -d same
Deleted branch same (was 08ad88c).
$ git tag -d same
Deleted tag 'same' (was 1187891)
And now they are gone:
$ ls -alh .git/refs/heads/
total 8
drwxr-xr-x  3 markhneedham  wheel   102B 13 Jun 23:16 .
drwxr-xr-x  5 markhneedham  wheel   170B 13 Jun 22:39 ..
-rw-r--r--  1 markhneedham  wheel    41B 13 Jun 23:08 master
$ ls -alh .git/refs/tags/
total 0
drwxr-xr-x  2 markhneedham  wheel    68B 13 Jun 23:16 .
drwxr-xr-x  5 markhneedham  wheel   170B 13 Jun 22:39 ..
Out of interest we’d ended up with this situation by mistake rather than by design but it was still fun to do a little bit of git digging to figure out how to solve the problem we’d created for ourselves.
I thought I’d then be able to just push the change using my Google user name and password but instead ended up with the following error:
➜ mhneedham-totally-lazy hg push
pushing to https://email@example.com/r/mhneedham-totally-lazy/
searching for changes
1 changesets found
http authorization required
realm: Google Code hg Repository
user: m.h.needham
password:
abort: HTTP Error 403: Forbidden
It turns out that you need to specifically set an option to use your Google account from the settings page:
And then it works!
Something that we want to do reasonably frequently on my current project is to push some, but not all, of the changes which have been committed to our local repository to master.
For example we might end up with 3 changes we haven’t pushed:
>> ~/github/local$ git status
# On branch master
# Your branch is ahead of 'origin/master' by 3 commits.
#
nothing to commit (working directory clean)
>> ~/github/local$ git hist
* bb7b139 Thu, 20 Oct 2011 07:37:11 +0100 | mark: one last time (HEAD, master) [Mark Needham]
* 1cef99a Thu, 20 Oct 2011 07:36:35 +0100 | mark:another new line [Mark Needham]
* 850e105 Thu, 20 Oct 2011 07:36:01 +0100 | mark: new line [Mark Needham]
* 2b25622 Thu, 20 Oct 2011 07:32:43 +0100 | mark: adding file for first time (origin/master) [Mark Needham]
And we only want to push the commit with hash 850e105 for example.
The approach which my colleague Uday showed us is to first take a temporary branch of the current state.
>> ~/github/local$ git checkout -b temp-branch
Switched to a new branch 'temp-branch'
Then immediately switch back to master and ‘get rid’ of the last two changes from there:
>> ~/github/local$ git checkout master
Switched to branch 'master'
Your branch is ahead of 'origin/master' by 3 commits.
>> ~/github/local$ git reset HEAD~2 --hard
HEAD is now at 850e105 mark: new line
We can then push just that change:
>> ~/github/local$ git push
Counting objects: 5, done.
Writing objects: 100% (3/3), 257 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
To /Users/mneedham/github/remote
   2b25622..850e105  master -> master
And merge the temporary branch back in again so we’re back where we were before:
>> ~/github/local$ git merge temp-branch
Updating 850e105..bb7b139
Fast-forward
 foo.txt |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)
>> ~/github/local$ git hist
* bb7b139 Thu, 20 Oct 2011 07:37:11 +0100 | mark: one last time (HEAD, temp-branch, master) [Mark Needham]
* 1cef99a Thu, 20 Oct 2011 07:36:35 +0100 | mark:another new line [Mark Needham]
* 850e105 Thu, 20 Oct 2011 07:36:01 +0100 | mark: new line (origin/master) [Mark Needham]
* 2b25622 Thu, 20 Oct 2011 07:32:43 +0100 | mark: adding file for first time [Mark Needham]
>> ~/github/local$ git status
# On branch master
# Your branch is ahead of 'origin/master' by 2 commits.
#
nothing to commit (working directory clean)
And finally we delete the temporary branch:
>> ~/github/local$ git branch -d temp-branch
Deleted branch temp-branch (was bb7b139).
We can achieve the same thing without creating the branch, by cherry picking the commits back again after we’ve pushed our changes, but this approach seems quicker.
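Another option, which is not the approach Uday showed us, is to name the commit directly in the push refspec so that nothing needs to be reset or merged back. The following is a minimal sketch; push_up_to is a hypothetical helper name. It pushes the given commit and everything before it, so it only helps when the commits we want are the oldest unpushed ones, as 850e105 is here.

```shell
#!/bin/sh
# Hypothetical helper: push everything up to and including commit $2 to
# branch $3 on remote $1, leaving any later local commits unpushed.
push_up_to() {
    # A refspec of the form <commit>:<ref> sets the remote ref to that commit.
    git push -q "$1" "$2:refs/heads/$3"
}
```

In the example above that would be push_up_to origin 850e105 master, after which git status should report that we are ahead of 'origin/master' by 2 commits.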
We recently wanted to get the Git history of a file which we knew existed but had now been deleted so we could find out what had happened to it.
Using a simple git log didn’t work:
git log deletedFile.txt
fatal: ambiguous argument 'deletedFile.txt': unknown revision or path not in the working tree.
We eventually came across Francois Marier’s blog post which points out that you need to use the following command instead:
git log -- deletedFile.txt
I’ve tried reading through the man page but I’m still not entirely sure what the distinction between using -- and not using it is supposed to be.
If someone could explain it that’d be cool…
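For what it’s worth, the git documentation describes -- as the separator between revisions and paths: everything after it is treated as a path, so git no longer tries to read deletedFile.txt as a revision or to find it in the working tree. A small sketch of that, where the deleted_file_history name is just for illustration:

```shell
#!/bin/sh
# Hypothetical helper: show the one-line history of path $1, even if the
# file no longer exists in the working tree.
deleted_file_history() {
    # Everything after "--" is treated as a path, never as a revision.
    git log --oneline -- "$1"
}
```

The most recent entry that deleted_file_history deletedFile.txt prints should be the commit that removed the file.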
We’ve had an xsbt branch on our gitolite powered repository for the last couple of weeks while we worked out how to move our build from sbt 0.7 to sbt 0.10 but having finally done that we needed to delete it.
I originally tried running the following command from one of our developer workstations:
git push origin :xsbt
But ended up with the following error:
remote: error: denying ref deletion for refs/heads/xsbt
! [remote rejected] xsbt (deletion prohibited)
A bit of googling led me to this Stack Overflow thread which suggested that you needed to be an administrator in order to delete a remote branch.
Once we’ve done that we can run the following command on each machine to delete the remote-tracking reference to the branch:
git branch -d -r origin/xsbt
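An alternative to naming the branch on each machine is to let git prune whatever has disappeared from the server. This is a sketch of that swapped-in technique, with prune_stale_branches being a made-up name:

```shell
#!/bin/sh
# Hypothetical helper: drop every remote-tracking branch of remote $1 whose
# branch has been deleted on the server.
prune_stale_branches() {
    git fetch -q --prune "$1"
}
```

Running prune_stale_branches origin removes origin/xsbt along with any other stale remote-tracking branches, so it is a broader operation than the single git branch -d -r call.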
One problem we’ve come across a few times over the last couple of months while using Mercurial is the situation where we want to quickly commit a local change without committing other local changes that we’ve made.
The example we came across today was where we wanted to make a change to the build file as we’d made a mistake in the target that runs on our continuous integration server and hadn’t noticed for a while during which time we’d accumulated other local changes.
The following is a rough diagram of the situation we had:
We had multiple file changes in our working directory which hadn’t yet been checked in to the local repository or the central repository.
We wanted to push just the change in blue.
My initial thought was that I could check in just that one file into our local repository and then push it to the central one.
hg ci -m "mark: updating build file to fix build" -A /path/to/build.file
I then wanted to push that change but when I went to do so I realised that there were other incoming changes which we hadn’t yet integrated with.
In order to integrate with those changes we need to make sure that we don’t have any locally uncommitted changes which of course in this scenario we do since we deliberately chose not to check in some of our local changes.
One way around this would be to just force the push and ignore the need to integrate with the remote changes. That doesn’t seem like the right approach to me, but I’m not sure what is.
We ended up just checking in everything we had locally, commenting out the bits that we were currently working on, merging with the remote changes and then pushing everything to the remote repository.
That’s obviously a really poor way of solving the problem so I’d be interested in what a good way to solve this problem would be!
I was reading a recent blog post by Gabriel Schenker where he discusses how his team is making use of Git and about half way through he says the following:
When using Git as your SCM it is normal to work for quite a while – maybe for a couple of days – in a local branch and without ever pushing the changes to the origin. Usually we only push when a feature is done or a defect is completely resolved.
We’ve been using Mercurial on the project I’m currently working on over the past few months and although it’s a similar tool we’ve been following a different approach.
We’ve got it set up the same way we would set up Subversion:
We’ve been trying to push to the central repository as frequently as possible, just as we would if we were using Subversion.
I don’t know the Git workflow that well because I haven’t used it on a project yet but we’ve always found that it’s beneficial to integrate with code being written by others on the team as frequently as possible.
Not doing this can lead to the problems which Martin Fowler outlines in his post about feature branches.
We’ve tried to ensure that after every commit the build still passes although we do sometimes have broken versions in the code committed locally because we don’t run our full test suite before every local check in.
Even if a feature isn’t completed I still think it’s valuable to have what we’ve done so far checked in, and it also helps remove the problem of needing to back up local repositories:
Since we are going to work locally potentially for days without pushing to the origin (our central repository) we might well loose our work if we have a hard disk crash or our office is flooded. Thus we need some backup strategy.
We just need to make sure the central repository is being backed up and then the danger of losing our work is significantly reduced.