Mark Needham

Thoughts on Software Development

Archive for the ‘xml’ tag

Parsing XML from the unix terminal/shell


I spent a bit of time today trying to put together a quick script that would grab story numbers from the commits in our Git repository and then work out which functional areas those stories belonged to by querying Mingle.

To do that I wanted to make a curl request to Mingle, pipe the result somewhere and run an XPath expression against it to get the element I was after.

I didn’t want to have to write code in a separate script file and then reference that file from the shell. While searching for a way to avoid that, I came across XMLStarlet on Stack Overflow.

It’s installable via MacPorts:

sudo port install xmlstarlet

I was then able to pipe the result of my Mingle request into xmlstarlet and pick the value out of the following bit of XML:

<property type_description="Managed text list" hidden="false">
  <name>Functional Area</name>
  <value>Our Functional Area</value>
</property>

The command looks like this:

curl -s http://user:password@mingleurl:8888/api/v2/projects/project_name/cards/1.xml | xmlstarlet sel -t -v "//property/name[. = 'Functional Area']/../value"

There’s much more you can do with the command, all of which is listed on the documentation page.

Written by Mark Needham

September 3rd, 2011 at 11:42 pm

Querying Xml with LINQ – Don’t forget the namespace


I’ve been working with a colleague on parsing a Visual Studio project file using LINQ to effectively create a DOM of the file.

The first thing we tried to do was get a list of all the references from the file. It seemed like a fairly easy problem to solve but for some reason nothing was getting returned:

XDocument projectFile = XDocument.Load(projectFilePath.Path);
 
var references = from itemGroupElement in projectFile.Descendants("ItemGroup").First().Elements()
                 select itemGroupElement.Attribute("Include").Value;

We are selecting all the occurrences of ‘ItemGroup’, taking the first occurrence, getting all the elements inside it (i.e. all the Reference elements) and then selecting the value of the ‘Include’ attribute. A fragment of the csproj file is as follows:

<ItemGroup>
	<Reference Include="System" />
	<Reference Include="System.Core">
		<RequiredTargetFramework>3.5</RequiredTargetFramework>
	</Reference>
...
</ItemGroup>

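After several hours of trial and error it turned out that we just needed to include the file’s namespace when querying. The project file declares a default namespace on its root element, so every element inside it belongs to that namespace and an unqualified name like ‘ItemGroup’ matches nothing. The new, working code looks like this: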

XNamespace projectFileNamespace = "http://schemas.microsoft.com/developer/msbuild/2003";
XDocument projectFile = XDocument.Load(projectFilePath.Path);
 
var references = from itemGroupElement in projectFile.Descendants(projectFileNamespace + "ItemGroup").First().Elements()
                 select itemGroupElement.Attribute("Include").Value;
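
To see the difference in isolation, here is a minimal sketch that parses an inline fragment instead of a real project file. The inline XML, the class name and the idea of reading the namespace from the root element are assumptions made for illustration, not what we actually had in our code.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class NamespaceQueryDemo
{
    static void Main()
    {
        // Hypothetical, trimmed-down project file. The xmlns declaration on the
        // root puts every child element into the MSBuild namespace.
        const string xml = @"
<Project xmlns=""http://schemas.microsoft.com/developer/msbuild/2003"">
  <ItemGroup>
    <Reference Include=""System"" />
    <Reference Include=""System.Core"" />
  </ItemGroup>
</Project>";

        XDocument projectFile = XDocument.Parse(xml);

        // An unqualified name only matches elements in no namespace,
        // so this finds nothing.
        Console.WriteLine(projectFile.Descendants("ItemGroup").Count()); // 0

        // Read the namespace from the root element rather than hard-coding it.
        XNamespace ns = projectFile.Root.Name.Namespace;

        var references = projectFile.Descendants(ns + "ItemGroup")
                                    .First()
                                    .Elements()
                                    .Select(reference => reference.Attribute("Include").Value);

        Console.WriteLine(string.Join(", ", references)); // System, System.Core
    }
}
```

Note that the ‘Include’ attribute is still looked up without a namespace. A default namespace declaration applies to elements only, so unprefixed attributes stay in no namespace.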

There are two quite clever things going on in the way this is done:

1) There is an implicit type conversion defined on XNamespace which allows us to instantiate it from a string.
2) The addition (+) operator has been overloaded on XNamespace so that it can combine the namespace with the local name (‘ItemGroup’) to produce an XName. This is described in more detail here.
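
Both points can be seen in a small sketch like the one below. The class and variable names are made up for illustration; the conversions and the operator are the ones described above.

```csharp
using System;
using System.Xml.Linq;

class XNameDemo
{
    static void Main()
    {
        // 1) Implicit conversion: string -> XNamespace.
        XNamespace ns = "http://schemas.microsoft.com/developer/msbuild/2003";

        // 2) Overloaded + operator: XNamespace + string -> XName.
        XName itemGroup = ns + "ItemGroup";

        Console.WriteLine(itemGroup.LocalName);     // ItemGroup
        Console.WriteLine(itemGroup.NamespaceName); // http://schemas.microsoft.com/developer/msbuild/2003
        Console.WriteLine(itemGroup);               // {http://schemas.microsoft.com/developer/msbuild/2003}ItemGroup

        // A string in the expanded {namespace}localName form converts
        // implicitly to the same XName.
        XName sameName = "{http://schemas.microsoft.com/developer/msbuild/2003}ItemGroup";
        Console.WriteLine(itemGroup == sameName);   // True
    }
}
```

This also explains why Descendants("ItemGroup") compiles but finds nothing: the plain string is converted to an XName with an empty namespace, which is a different name from the namespaced one.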

Written by Mark Needham

August 28th, 2008 at 10:15 am

Posted in .NET
