Mark Needham

Thoughts on Software Development

Archive for the ‘Software Development’ Category

Oracle: exp – EXP-00008: ORACLE error 904 encountered/ORA-00904: “POLTYP”: invalid identifier

without comments

I spent a bit of time this afternoon trying to export an Oracle test database with the exp tool so that we could use it locally.

I had to connect to exp like this:

exp user/password@remote_address

And then filled in the other parameters interactively.

Unfortunately when I tried to actually export the specified tables I got the following error message:

EXP-00008: ORACLE error 904 encountered
ORA-00904: "POLTYP": invalid identifier
EXP-00000: Export terminated unsuccessfully

I eventually came across Oyvind Isene’s blog post which pointed out that you’d get this problem if you tried to export a 10g database using an 11g client which is exactly what I was trying to do!

He explains it like so:

The export command runs a query against a table called EXU9RLS in the SYS schema. On 11g this table was expanded with the column POLTYP and the export command (exp) expects to find this column.

I needed to download the 10g client so that I could use that version of exp instead. I haven’t quite got it working yet but at least it’s a different error to deal with!

Written by Mark Needham

January 13th, 2012 at 9:46 pm

My Software Development journey: 2011

without comments

A couple of years ago I used to write a blog post reflecting on what I’d worked on in the preceding year and what I’d learned. Having read 2011 reviews by a couple of other people, I thought I’d have a go.

Am I actually learning anything?

A thought I had many times in 2011 was ‘am I actually learning anything?‘ as, although I was working with languages that I hadn’t used professionally before, the applications that I worked on were very similar to ones that I’ve worked on previously.

Often I’d work on something and know exactly how it should be designed and where we could go wrong since I’d done the same thing several times before and the challenge of not knowing what to do had disappeared somewhat.

Now and then…

I certainly failed to learn one thing a day as I suggested in a blog post a couple of years ago although eventually I managed to learn a bit about node.js and clojure by building some toy applications with my colleague Uday.

We decided to rewrite part of our Scala application in clojure in our own time to see what it’d look like which provided us with an interesting insight into what it’d be like to build a system for the second time when you know exactly what to do.

I also completed ml-class which was fun as it was the type of programming that I’ve never done before. Obviously I’m still a novice at the whole machine learning thing but it’s given me an idea of the sorts of things you can do.

Learning is doing

From February until April I was in Bangalore working as a trainer/coach for one of the ThoughtWorks University batches where we tried as much as possible to reduce the amount of ‘teaching’ done.

Sumeet has previously written about the new style of ThoughtWorks University which is more focused on people working on a real project than sitting in workshops and we tried to take this even further.

Previous groups had spent about 2 weeks doing workshop style sessions and then 4 weeks working on a project but we got it to the point where we spent just over a week in workshops and the rest working on the project.

In general I think it worked reasonably well and the skill level of the group seemed reasonably high by the end. We were lucky that there were only 13 people in the group – it would be interesting to see how our approach would scale.

I’ve also noticed this last year that when I’m learning something new it’s not enough to just do toy exercises anymore; I actually have to build something to retain interest.

During the Christmas holidays I decided to try and build a Flipboard style application for my Android phone so I can (yet again) capture the links that people post on twitter.

Actually having a real problem to solve has made me much more engaged than following a tutorial or hello world demo would have done.

Remembering the value of blogging

My rate of posting on here has decreased a lot over the last year which I think is partly down to the fact that I’ve written about a lot of the stuff I see on projects before but also because I started filtering what I thought was interesting enough to write about.

In hindsight the latter approach doesn’t necessarily make sense – the most read posts on this blog are the ones which I thought were the most pointless when I wrote them.

I got stuck in the mindset that I wasn’t actually learning anything by writing blog posts, which has been proved wrong multiple times, both in terms of what I learn in writing the post and what I learn from people’s comments.

Expressing opinions in big groups/public

I spent 10 months in late 2010/early 2011 working in India and one of the most interesting things I remember observing was that people seemed very reluctant to express their opinion in big groups.

I thought that was something specific to India but on coming back to the UK I’ve noticed the same thing here as well which means we need to adjust our approach in retrospectives if we want everyone to participate.

I also learnt that expressing strong opinions in public isn’t necessarily the most effective way of making change happen. I probably should have learnt this already but it became increasingly evident how ineffective this approach was in 2011.

Going at my own pace

A couple of years ago I was advised by a couple of colleagues that the way to get to the ‘next level’ was to become more knowledgeable about the overall architectural design of systems but at the time I wasn’t that interested in that.

It’s only more recently that I’ve found it interesting to read about different architectures on High Scalability or Systems We Make.

Another interesting way for me to learn in this area is to try and understand the architectures used in other ThoughtWorks projects that I didn’t work on and see how they compare to the ones I’ve worked on.

I generally can’t force myself to be interested in something if I’m not but once I am interested then I want to learn every detail about it so it’s better to wait until I become interested naturally.

The next thing which I’m sure I’ll eventually become interested in is tech leading a team which several of my peers (in terms of years of experience) are doing now or have been doing for a year or two. Right now though I want to focus on coding!

Overall…

I’m not sure 2011 was a year where I learned as much as I did in previous years – the learning did seem to taper off a bit which in a way is inevitable unless you completely change your role/the types of things you’re building.

In 2012 I plan to keep learning about Android development and I’m going to be doing algo-class to try and get better at another aspect of programming which I’m not very good at right now.

Written by Mark Needham

January 3rd, 2012 at 1:48 am

Yak Shaving: Tracking the yak stack

with 4 comments


While I’ve been learning how to write an Android application there have been plenty of opportunities for me to go off shaving yaks; it’s pretty much Yakville Central.

Typically I’d end up spending hours trying to work out some obscure thing which I didn’t really need to know so I wanted to try and avoid that this time.

I started keeping track of the ‘yak stack’ which I was currently following and mentally noting exactly where I was up to.

An example of a yak stack I kept while trying to authorise a user of the app with Twitter using OAuth is shown in the photo on the right hand side.

It ended up looking like this:

  • Get the home timeline working
    • OAuth blowing up
      • Not actually capturing redirect back to app
        • Launch mode in Android manifest

Once I realise I’m heading down the stack I’ve been giving myself one pomodoro to try and dig myself one level up.

If I still haven’t managed to solve the problem I might keep going for one more pomodoro or just find another way around the problem.

I’m sure I’ll come across problems where I need to spend more than an hour trying to solve them but for now it’s working ok as a rule of thumb.

It’s definitely fun chasing yaks but I get to the end of the day and haven’t really achieved anything which isn’t fun.

Written by Mark Needham

December 31st, 2011 at 3:54 am

Posted in Software Development


The supposed black box

without comments

On a reasonable number of the systems that I’ve worked on over the past few years there’s been a ‘black box’ component which the team I’ve been on has needed to integrate with.

I’ve always found it a little strange that you wouldn’t need to/want to know how that part of the system worked or that you could actually believe that it was truly a black box.

If it doesn’t work then you have no way of diagnosing the problem – did you do something wrong, was there something wrong inside the black box or was it something else?

On a project I worked on a few years ago the reason for the black box thinking was that each layer was being developed by people from a different company.

The problem we had was that we were working on the top layer, the one that was visible to the end user and therefore our progress was very visible to the stakeholders who were paying for the product to be built.

We therefore had no choice but to go into the metaphorical black box and try and gather as much information as we could to pass on to the teams working on the other layers so that they would be able to help us better.

I recently watched a talk by Artur Bergman titled ‘Full Stack Awareness‘ where he talks about the necessity of understanding exactly what is happening when our code gets executed rather than thinking of it as magic.

Although Artur is working in a different context to most application developers who maybe don’t need to know the stack as well as he does I think the advice about treating something as magic is useful.

If we think of something as a ‘black box’ then effectively we are saying that it’s somewhat magic.

If the integrated component is being custom written then I think the team who needs to integrate with it should at the very least have someone who knows how it works very well so they can diagnose any problems quickly.

That person then needs to spread their knowledge amongst the rest of the team so that they don’t end up being the bottleneck.

In summary I think the term ‘black box’ is frequently a misnomer and we’ll rarely be able to view said black box in such an opaque way.

Written by Mark Needham

December 20th, 2011 at 11:57 pm

WebDriver: Getting it to play nicely with Xvfb

with 2 comments

Another thing we’ve been doing with WebDriver is having it run with the FirefoxDriver while redirecting the display output into the Xvfb framebuffer so that we can run it on our continuous integration agents which don’t have a display attached.

The first thing we needed to do was set the Java system property ‘webdriver.firefox.bin’ to our own script which would point the display to Xvfb before starting Firefox:

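import java.lang.System._
import org.openqa.selenium.firefox.FirefoxDriver

// Point WebDriver at our wrapper script instead of the real Firefox binary
lazy val firefoxDriver: FirefoxDriver = {
  setProperty("webdriver.firefox.bin", "/our/awesome/starting-firefox.sh")
  new FirefoxDriver()
}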

Our first version of the script looked like this:

/our/awesome/starting-firefox.sh

#!/bin/bash
 
rm -f ~/.mozilla/firefox/*/.parentlock
rm -rf /var/go/.mozilla
 
 
XVFB=`which Xvfb`
if [ "$?" -eq 1 ];
then
    echo "Xvfb not found."
    exit 1
fi
 
$XVFB :99 -ac &
 
 
BROWSER=`which firefox`
if [ "$?" -eq 1 ];
then
    echo "Firefox not found."
    exit 1
fi
 
export DISPLAY=:99
$BROWSER &

The mistake we made here was that we started Xvfb in the background which meant that sometimes it hadn’t actually started by the time Firefox tried to connect to the display and we ended up with this error message:

No Protocol specified
Error cannot open display :99

We really wanted to keep Xvfb running regardless of whether the Firefox instances being used by WebDriver were alive or not so we moved the starting of Xvfb out into a separate script which we run as one of the earlier steps in the build.
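Moving Xvfb into its own build step still leaves the question of how that step knows the display is ready. A minimal sketch of one way to check, assuming xdpyinfo is installed on the agent and that sleep accepts fractional seconds (GNU sleep does); wait_until is a hypothetical helper and the retry numbers are arbitrary:

```shell
#!/bin/bash

# Hypothetical helper: retry a command until it succeeds or we run out of attempts.
# usage: wait_until <attempts> <delay-in-seconds> <command> [args...]
wait_until() {
    attempts_left=$1
    delay=$2
    shift 2
    while [ "$attempts_left" -gt 0 ]; do
        # Discard the command's output; we only care about its exit status
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts_left=$((attempts_left - 1))
        sleep "$delay"
    done
    return 1
}

# Assumed usage in the Xvfb start-up step (not run here):
#   Xvfb :99 -ac &
#   wait_until 25 0.2 xdpyinfo -display :99 || { echo "Xvfb did not come up on :99" >&2; exit 1; }
```

The build would then fail early with a clear message instead of Firefox failing later with the 'cannot open display' error.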

We also struggled to get the FirefoxDriver to kill itself after each test as calling ‘close’ or ‘quit’ on the driver didn’t seem to kill off the process.

We eventually resorted to putting a ‘pkill firefox’ statement at the start of our firefox starting script:

/our/awesome/starting-firefox.sh

#!/bin/bash
 
rm -f ~/.mozilla/firefox/*/.parentlock
rm -rf /var/go/.mozilla
 
pkill firefox
 
BROWSER=`which firefox`
if [ "$?" -eq 1 ];
then
    echo "Firefox not found."
    exit 1
fi
 
export DISPLAY=:99
$BROWSER &

It’s a bit hacky but it does the job more deterministically than anything else we’ve tried previously.
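One thing that could probably be tightened up in the scripts above is the check for a missing binary: they compare the exit status of which against exactly 1, and I'm not sure every platform's which returns exactly that. A rough sketch of an alternative using the shell's built-in command -v; require_command is a made-up helper name:

```shell
#!/bin/bash

# Hypothetical helper: print the path of a required command, or complain on stderr and fail.
require_command() {
    if ! command -v "$1"; then
        echo "$1 not found." >&2
        return 1
    fi
}

# Assumed usage inside the Firefox starting script (not run here):
#   BROWSER=$(require_command firefox) || exit 1
```

This treats any non-zero status as 'not found' rather than relying on one specific value.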

Written by Mark Needham

December 15th, 2011 at 11:19 pm

Posted in Software Development


WebDriver: Getting it to play nicely with jQuery ColorBox

without comments

As I mentioned in an earlier post about removing manual test scenarios we’ve been trying to automate some parts of our application where a user action leads to a jQuery ColorBox powered overlay appearing.

With this type of feature there tends to be some sort of animation which accompanies the overlay so we have to wait for an element inside the overlay to become visible on the screen before trying to do any assertions on the overlay.

We have a simple method to do that:

def iWaitUntil(waitingFor: => Boolean) {
  for (i <- 1 to 5) {
    if(waitingFor) {
      return
    }
    Thread.sleep(200)
  }
}

It can then be called like this in our tests:

val driver: WebDriver = new FirefoxDriver()
 
iWaitUntil(driver.findElements(By.cssSelector(".i-am .inside-the-colorbox h3")).nonEmpty)
driver.findElement(By.cssSelector(".i-am .inside-the-colorbox h3")).getText should equal("Awesome Title")

Annoyingly what we noticed was that this wasn’t enough and although the h3 element was coming back as being visible it was then failing the following assertion because ‘getText’ was returning nothing despite the fact that it clearly had text inside it!

Uday came up with the neat idea of adding an additional wait clause which would wait until the text was non empty so we now have something like this…

val driver: WebDriver = new FirefoxDriver()
 
iWaitUntil(driver.findElements(By.cssSelector(".i-am .inside-the-colorbox h3")).nonEmpty)
iWaitUntil(driver.findElement(By.cssSelector(".i-am .inside-the-colorbox h3")).getText != "")
driver.findElement(By.cssSelector(".i-am .inside-the-colorbox h3")).getText should equal("Awesome Title")

…which seems to do the job nicely.

An alternative approach would have been to disable the animation of jQuery ColorBox just for our tests but the approach we took was a much quicker win at the time.

We did realise later on that we didn’t need to write our own wait method since WebDriver has one built into the API (WebDriverWait), but they both do similar things so it’s not such a big problem.

Written by Mark Needham

December 13th, 2011 at 11:31 pm

Posted in Software Development


The 5 Whys/Root cause analysis – Douglas Squirrel

with 4 comments

At XP Day I was chatting to Benjamin Mitchell about the 5 whys exercises that we’d tried on my team and I suggested that beyond Eric Ries’ post on the subject I hadn’t come across an article/video which explained how to do it.

Benjamin mentioned that Douglas Squirrel had recently done a talk on this very subject at Skillsmatter and as with most Skillsmatter talks there’s a video of the presentation online.

Gojko wrote a post summarising the talk at the time but I was interested in seeing how a 5 whys facilitated by Douglas would compare to the ones that we’d done.

These were some of my observations/learnings:

  • Douglas started off with a similar approach to the one we tried in our last attempt whereby he listed all the initial problems across the board and then worked through them.

    One thing he did much better was ensuring that the 5 whys were covered for each problem before moving onto the next one. He described this as ‘move down, then across‘ and made the interesting observation that when you get to the real root cause (in theory the 5th why) there will be a pause and it will hurt.

    I don’t remember noticing that in any of our 5 whys which means, Douglas suggests, that ‘you[/we] are not doing it right’. In terms of actually getting to the root cause he’s probably right but you can still learn some useful things even if you don’t dig down that far.

  • He also made the suggestion that we shouldn’t follow whys which we can immediately see are not going to go anywhere – we’d be better off going down one of the other nodes which might lead us to some useful learning.

    I think we made the mistake of following some nodes which we could tell were going to go nowhere the first time that we did the exercise and ended up reaching a 5th why which was so general that we couldn’t do anything with it.

    On the other hand I think it probably takes a couple of goes at the 5 whys before you can say with certainty that following a why is going to go nowhere.

  • Another suggestion was to ensure that everyone linked with the problem being discussed is in the room, partly so that they don’t end up being made the scapegoat in absentia.

    In the two exercises we’ve run we only included the people on our immediate team and we did reach a point where it was difficult to work out what the answer to some of the whys should be because the person who could answer that question wasn’t in the room.

    It does obviously make it more logistically difficult to organise the meeting, especially if you have people working in different countries.

  • Squirrel suggested that any actions that come out of the meeting should be completable in a week which helps to ensure that they’re realistic and proportionate to the problem.

    If something goes wrong once then we don’t necessarily need to make massive changes to avoid it in future; it might be sufficient to just make some small changes and then observe if things have improved.

Overall I found the talk quite useful and it was especially helpful to be able to see how a more experienced facilitator, like Douglas, was able to guide the discussion back into the framework so that it didn’t drift off.

I’m not yet convinced that we would want to run a 5 whys exercise every week which is what I’ve heard suggested before – I think the format could quickly become dull to people as with any other meeting format when used repeatedly.

Written by Mark Needham

December 10th, 2011 at 2:11 pm

Continuous Delivery: Removing manual scenarios

without comments

On the project that I’m currently working on we’re trying to move to the stage where we’d be able to deploy multiple times a week while still having a reasonable degree of confidence that the application still works.

One of the (perhaps obvious) things that we’ve had to do as a result of wanting to do this is reduce the number of manual scenarios that our QAs need to run through.

At the moment it takes a few hours of manual testing time on top of all our automated scenarios before we can put the application in production which is fine if you release infrequently but doesn’t really scale.

Following the ideas of pain driven development we delayed automating any bits of the application if we couldn’t work out how to automate them easily.

For example we have quite a few Javascript driven light boxes which pop up on various user actions and which we weren’t able to test using the HTML Unit Driver in Web Driver, so we created automated tests only for the non Javascript version of those features.

We’ve started work on automating these scenarios recently and although we’re having to invest some time in fiddling around with the Firefox Driver to do so, it will save us time on each deployment so it should be worth it.

My colleague Harinee showed me an interesting post by Anand Bagmar in which he describes a cost/value matrix which can help us decide whether or not to automate something.

In this case the lightbox feature was initially only used in a couple of places and we realised it would be quite difficult to test it compared to our current approach so it fitted in the ‘Naaaah’ section (Low value, High cost).

Over time it has been introduced in more and more places so it’s become more important that we find a way to automatically test it since the application can fail in many more places if it doesn’t work.

It’s therefore moved into section #2 (High value, High cost) and become a candidate for automation.

Although not directly related to manual scenarios, one thing we haven’t quite worked out yet is what the balance should be between getting something into production and ensuring that it isn’t full of bugs.

This is also something which James Birchler addresses in his post about IMVU’s approach to continuous deployment:

We’re definitely not perfect: with constrained QA resources and a persistent drive by the team to deliver value to customers quickly, we do often ship bugs into production or deliver features that are imperfect. The great thing is that we know right away what our customers want us to do about it, and we keep on iterating.

At the moment the application is only being beta tested by a small number of users so we’re veering towards pushing something even if it’s not perfect but as we open it up to more people I imagine it’ll become a bit more stringent.

At times it’s quite painful working out what you need to change in your approach to make it possible to release frequently but it’s cool to see the progress we’re making as we get better at doing so.

Written by Mark Needham

December 5th, 2011 at 11:13 pm

The 5 whys: Another attempt

with one comment

Towards the end of the week before last and the beginning of last week we’d been having quite a few problems with our QA environment to the point where we were unable to deploy anything to it for 3 days.

A few weeks ago I wrote about a 5 whys exercise that we did in a retrospective and in our weekly code review we decided to give it a go and see what we could learn.

We started with the question ‘Why was there a mess?‘ and then branched out the first level whys since it was fairly clear that there wasn’t only one thing which had contributed to our problems.


We ended up with 5 answers to the first why:

  • There was a DNS change
  • Volume was deleted from our QA server
  • System tests failing
  • Change in one project hanging QA deployment
  • Main build broken for a while

We then worked across the whiteboard taking each of these in turn.

I think our approach allowed us to avoid part of ‘the cult of the root cause‘ which Don Reinertsen wrote about.

It still wasn’t quite spot on due to some mistakes I made while facilitating but these were my observations:

  • Once we got to answering the whys for the 4th and 5th first level whys the whiteboard was way too cluttered and it had become quite difficult to see exactly where we’d got up to.

    As a result we lost the discipline around answering the question why and drifted off into general discussion around the original question but stopped drilling down further looking for a potential root cause.

    The next time I think it would probably work better to look for the first why and collect any potential other whys on the same level in a ‘parking lot’ type area which we could then go to later on.

  • Having said that, a neat thing about having the whys alongside each other was that we were able to see that the first two whys were linked to each other.

    Both changes had been done by someone in the operations team based on conversations they had with people on our team.

    We realised that our communication with the operations team hadn’t been entirely clear and had left room for doubt which had led to unexpected changes being made to the servers.

    This was an example of us stopping before we’d drilled down to 5 levels having realised that we could influence the situation positively even if we hadn’t found the root cause of the problem.

  • Drilling down into the ‘System tests failing’ led to the most interesting insights:
    • System tests failing

      • No one cares about them
        • We can push to QA even if they’re broken
        • Used to them failing
          • Perception amongst devs that they’re flaky
            • There had previously been a time when data changed frequently and broke them.
        • Seen as being owned by the QAs
          • The tests were defined by QAs
        • The time from checkin to system tests failing is quite long

    Looking back at this now we probably should have drilled a bit further down on some of the whys.

    We actually ended up discussing the perception amongst the developers that the tests were flaky and it was pointed out that most of the failures were actually real.

    We don’t currently have a ‘stop the line’ mentality if the system tests fail but have agreed to adopt that approach for the next iteration and check at the end of this week to see if we’ve improved.

  • Even though I didn’t facilitate the exercise perfectly I think there was still a far greater level of analysis done by the team in this exercise than in others that I’ve seen.

    I’ve noticed that a lot of retrospective type exercises tend to only encourage surface level analysis so we never really go deeper into a subject and see if we can actually make some useful changes to the way that we work.

Written by Mark Needham

November 13th, 2011 at 11:08 pm

fgrep: Searching for a list of identifiers

without comments

We had a problem to solve earlier in the week where we wanted to try and find out which files we had ingested into our database based on a unique identifier.

We had a few hundred thousand files to search through to try and find the ones where around 50,000 identifiers were mentioned so that we could re-ingest them.

Running a normal grep for each identifier individually took a ridiculously long time so we needed to find a way to search for all of the identifiers at the same time to speed up the process.

Luckily my colleague knew about fgrep which allowed us to do this.

fgrep is essentially grep (or egrep) with no special characters. If you want to search for a simple string without wild cards, use fgrep. The fgrep version of grep is optimized to search for strings as they appear on the command line, so it doesn’t treat any characters as special.

We created a file containing all the identifiers:

identifiers.txt

identifier1
identifier2
identifier3

And then created the following command to identify which files those identifiers existed in:

fgrep -Rl -f identifiers.txt .

We passed the ‘-l’ flag because we don’t care where in the file the identifier matches, we just care that it exists in the file.

If we only have a few different things to search for then we could supply those directly to ‘fgrep’ without the file:

fgrep -Rl -e "identifier1" -e "identifier2" -e "identifier3" .

I hadn’t used ‘fgrep’ before but it came in quite useful for us here. I also came across this article which explains the different variants of grep in more detail.
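To see the single pass approach working end to end, here's a small self-contained sketch that builds a throwaway directory and searches it. The file names and contents are made up for illustration, and it uses 'grep -F', which is the equivalent spelling of 'fgrep' (recent GNU grep releases print a warning when 'fgrep' is used):

```shell
#!/bin/bash

# Throwaway sandbox so the sketch doesn't touch any real data
workdir=$(mktemp -d)
mkdir "$workdir/files"
printf 'record for identifier1\n' > "$workdir/files/a.txt"
printf 'nothing of interest\n'    > "$workdir/files/b.txt"
printf 'mentions identifier3\n'   > "$workdir/files/c.txt"
printf 'identifier1\nidentifier2\nidentifier3\n' > "$workdir/identifiers.txt"

# One pass over the tree: every identifier is matched as a fixed string
# and only the names of matching files are listed
matches=$(cd "$workdir/files" && grep -F -R -l -f ../identifiers.txt . | sort)
echo "$matches"   # prints ./a.txt and ./c.txt

rm -rf "$workdir"
```

One thing to watch out for is that fixed string matching is still substring matching, so 'identifier1' would also match a file that only contains 'identifier10'. Adding the '-w' flag restricts matches to whole words if that matters for your identifiers.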

Written by Mark Needham

November 10th, 2011 at 11:37 pm

Posted in Software Development
