Mark Needham

Thoughts on Software Development

Archive for the ‘CouchDB’ tag

CouchDB: Join like behaviour with list functions

with 2 comments

I’ve been playing around with the Twitter streaming API a bit lately to see which links are being posted most frequently by the people I follow and then storing the appropriate tweets in CouchDB.

I recently came across a problem which I struggled to solve for quite a while.

I started with the following map function:

{
  "_id" : "_design/query",
  "views" : {
    "by_link" : {
      "map" : "function(doc){ emit(doc.actual_link, { user : doc.user.screen_name, text : doc.text })}"
    }
  }
}

This results in the following data set:

curl http://127.0.0.1:5984/twitter_links/_design/query/_view/by_link?limit=20
{"total_rows":7035,"offset":0,"rows":[
{"id":"abf54db1d92bfe0e8aaaa9ec51f237bd","key":"http://2dboy.com/2011/02/08/ipad-launch/","value":{"user":"Nash","text":"World of Goo\u2019s iPad Launch http://instapaper.com/zzqrqw32e"}},
{"id":"b8911545ff45438671081260ae0d42b1","key":"http://3.bp.blogspot.com/_T6MpHfZv2qQ/SpKGGjsoQoI/AAAAAAAADIA/Jsa5JDqX9X0/s400/moleskine3.jpg","value":{"user":"oinonio","text":"@stephenfry a Babushka Little My? http://bit.ly/fjPg2a"}},
{"id":"be12d30d1c8b882d8ce0124585fabb19","key":"http://3.bp.blogspot.com/_UAzEooLfuI8/S7aOiCBdAzI/AAAAAAAAF8Y/5W61I9VHxPE/s1600-h/deforestation.jpg","value":{"user":"ironshay","text":"A big problem caused by deforestation http://bit.ly/9qArCg"}}
]}

What I want to do is go from…

  • Link Url 1 -> Tweet 1
  • Link Url 1 -> Tweet 2
  • Link Url 2 -> Tweet 3

…to…

  • Link Url 1 -> [Tweet 1, Tweet 2]
  • Link Url 2 -> [Tweet 3]
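Leaving CouchDB aside for a moment, the transformation itself can be sketched in plain JavaScript. The function name `groupByKey` and the sample rows are made up for illustration, and the sketch assumes the rows arrive sorted by key, which is how CouchDB returns view results.

```javascript
// Group rows that share a key into one entry per key.
// Assumes rows are already sorted by key, as CouchDB view results are.
function groupByKey(rows) {
  const grouped = [];
  let current = null;
  for (const row of rows) {
    // Start a new group whenever the key changes.
    if (current === null || current.key !== row.key) {
      current = { key: row.key, values: [] };
      grouped.push(current);
    }
    current.values.push(row.value);
  }
  return grouped;
}

// Hypothetical rows mirroring the bullet points above.
const rows = [
  { key: "Link Url 1", value: "Tweet 1" },
  { key: "Link Url 1", value: "Tweet 2" },
  { key: "Link Url 2", value: "Tweet 3" },
];

console.log(JSON.stringify(groupByKey(rows)));
// [{"key":"Link Url 1","values":["Tweet 1","Tweet 2"]},{"key":"Link Url 2","values":["Tweet 3"]}]
```

Because the input is sorted, a single pass that watches for a change of key is enough; no lookup table is needed.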

I originally tried to use a reduce function after following Chris Chandler’s blog post but that resulted in a ‘reduce_overflow_error’.
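For context, CouchDB expects a reduce function to return something much smaller than its input, and it complains when the output does not shrink. The sketch below is an assumption about the general shape of a "collect everything" reduce, not the code from that blog post; `collectReduce` and the sample values are invented for illustration.

```javascript
// Hypothetical reduce that gathers every value for a key into one array.
// It follows CouchDB's reduce signature but runs here as plain JavaScript.
function collectReduce(keys, values, rereduce) {
  if (rereduce) {
    // On rereduce, values is an array of previously returned arrays.
    return values.reduce((all, part) => all.concat(part), []);
  }
  return values;
}

// Made-up tweet values standing in for the view output.
const values = Array.from({ length: 50 }, (_, i) => ({
  user: "user" + i,
  text: "tweet " + i,
}));

const reduced = collectReduce(null, values, false);
const inputSize = JSON.stringify(values).length;
const outputSize = JSON.stringify(reduced).length;

// The output is as large as the input, so nothing was actually reduced.
console.log(outputSize >= inputSize); // true
```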

Perryn pointed out that what I probably needed was a list function and I came across Chris Strom’s blog while trying to work out how to write one.

{
  "_id" : "_design/query",
  "views" : {
    "by_link" : {
      "map" : "function(doc){ emit(doc.actual_link, { user : doc.user.screen_name, text : doc.text })}"
    }
  },
  "lists" : {
    "index_tweets" : "function(head, req) {
     var row, last_key, tweets;
     send('{\"rows\" : [');
     while(row = getRow()) {
      if(last_key != row.key ) {
        if(typeof last_key != 'undefined') {
          send(toJSON({key : last_key, values : tweets}));
          send(',');
        }
        tweets = [];
        last_key = row.key;
      } 
      tweets.push(row.value);
     }
     send(toJSON({key : last_key, values : tweets}));
     send(']}');
  }"
  }
}

We then call the list function with an associated view function following this pattern from CouchDB: The Definitive Guide:

/db/_design/foo/_list/list-name/view-name

curl http://127.0.0.1:5984/twitter_links/_design/query/_list/index_tweets/by_link

Which gives the data in the required format:

{"rows" : [
{"key":"http://1.bp.blogspot.com/_XdP6Lp2ceqY/TU16NvdT-RI/AAAAAAAAlb8/7QtTN-XxBTM/s400/dcrHk.jpg","values":[{"user":"jhartikainen","text":"RT @codepo8: The dark secret of PacMan: http://bit.ly/exCBDy"}, {"user":"joedevon","text":"RT @codepo8: The dark secret of PacMan: http://bit.ly/exCBDy"}]},
{"key":"http://10poundpom.blogspot.com/","values":[{"user":"10poundpomCL","text":"@andy_murray Help my #£10aWeekCharityChallenge, all it takes is a RT. Read http://10poundpom.blogspot.com/ for more."}]},
{"key":"http://10rem.net/blog/2011/02/09/enhancing-the-wpf-screen-capture-program-with-window-borders","values":[{"user":"brian_henderson","text":"Enhancing the WPF Screen Capture Program with Window Borders: by @Pete_Brown: http://bit.ly/icmXG5 #wpf #win32"},{"user":"SittenSpynne","text":"RT @Pete_Brown: Blogged: Enhancing the WPF Screen Capture Program with Window Borders http://bit.ly/icmXG5 #wpf #win32"}]}]}

Maybe there’s an even better way to solve this problem that I don’t know about…let me know if there is!

Written by Mark Needham

February 13th, 2011 at 5:58 pm

Posted in CouchDB

CouchDB: ‘badmatch’ when executing view

with one comment

I’ve been playing around with CouchDB again in my annual attempt to capture the links appearing on my Twitter stream and I managed to create the following error for myself:

$ curl http://127.0.0.1:5984/twitter_links/_design/cleanup/_view/find_broken_links
{"error":"badmatch","reason":"{\n   \"find_broken_links\": {\n       \"map\": \"function(doc) {   \nvar prefix = doc.actual_link.match(/.*/);            \n  if(true) {                  emit(doc.actual_link, null);                }              }\"\n   }\n}"}

It turns out this error occurs because I managed to introduce newline characters into the view while editing it inside CouchDBX. D’oh!

A better way is to edit the view in a text editor and then send it to CouchDB using curl.
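One way to avoid hand-escaping the function is to write it as ordinary JavaScript and let `JSON.stringify` build the design document. This is only a sketch: the function body is copied from the error message above, and the idea of running it with Node and redirecting the output into `cleanup.json` is an assumption about the workflow.

```javascript
// The view's map function, written as normal multi-line JavaScript.
// `emit` is provided by CouchDB when the view runs; it is never called here.
const mapFn = function (doc) {
  var prefix = doc.actual_link.match(/.*/);
  if (true) {
    emit(doc.actual_link, null);
  }
};

const designDoc = {
  _id: "_design/cleanup",
  views: {
    // toString() gives the function source, newlines included.
    find_broken_links: { map: mapFn.toString() },
  },
};

// JSON.stringify escapes the newlines as \n, so the body is valid JSON.
const body = JSON.stringify(designDoc);
console.log(body);
```

Saving the printed output to a file gives something that can be sent with `curl -d @cleanup.json`.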

The proper way to update a view would be to add a ‘_rev’ property to the body of the JSON document, but I find it annoying to go and edit the document each time, so I’ve just been deleting and then recreating my views.

$ curl -X GET http://127.0.0.1:5984/twitter_links/_design/cleanup/
{"_id":"_design/cleanup","_rev":"1-8be14d29f183b61f1ade160badef3f75","views"...}
$ curl -X DELETE http://127.0.0.1:5984/twitter_links/_design/cleanup?rev=1-8be14d29f183b61f1ade160badef3f75
{"ok":true,"id":"_design/cleanup","rev":"2-9fa15c1fdbb7cbaa659d623bc897b9da"}
$ curl -X PUT http://127.0.0.1:5984/twitter_links/_design/cleanup -d @cleanup.json
{"ok":true,"id":"_design/cleanup","rev":"17-b0763381b79f3fda843f57a7dcc842e1"}
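For comparison, the ‘_rev’ approach amounts to copying the current revision into the new body before the PUT. A minimal sketch, where `withRev` is a made-up helper and the two documents are trimmed-down stand-ins for the real ones:

```javascript
// Copy the current revision into the edited document so that a PUT
// updates the existing design document rather than conflicting with it.
function withRev(newDoc, currentDoc) {
  return { ...newDoc, _rev: currentDoc._rev };
}

// Stand-in for the document returned by the GET above.
const current = {
  _id: "_design/cleanup",
  _rev: "1-8be14d29f183b61f1ade160badef3f75",
  views: {},
};

// Stand-in for the locally edited document, which has no _rev yet.
const edited = {
  _id: "_design/cleanup",
  views: {
    find_broken_links: { map: "function(doc) { emit(doc.actual_link, null); }" },
  },
};

console.log(JSON.stringify(withRev(edited, current)));
```

If someone else has changed the document in the meantime the revision will be stale and CouchDB will reject the PUT, which is the optimistic concurrency check doing its job.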

I guess there’s probably a library somewhere which would encapsulate all that for me but I’m just hacking around at the moment.

It’s interesting to see how you interact differently with a document database compared to what you’d do with a relational one with respect to optimistic concurrency.

Written by Mark Needham

February 12th, 2011 at 6:03 pm

Posted in CouchDB

CouchDB/Futon: ‘_all_dbs’ call returns databases with leading ‘c/’

with one comment

As I mentioned in my previous post, I’ve been playing around with CouchDB. One of the problems I’ve been having is that although I can access my database through the REST API perfectly fine, whenever I go to the Futon page (‘http://localhost:5984/_utils/’ in my case) to view my list of databases I get the following JavaScript error:

Database information could not be retrieved: missing

I thought I’d have a quick look with FireBug to see if I could work out what was going on and saw several requests being made to the following urls and resulting in 404s:

  • http://localhost:5984/c%2Fsharpcouch/
  • http://localhost:5984/c%2Fmark_erlang/

The value ‘c/’ was being added to the front of each of my database names, which meant that Futon was unable to display the various attributes on the page for each of them.
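The ‘%2F’ in those URLs is just the URL-encoded form of ‘/’, which a quick check in JavaScript confirms:

```javascript
// '/' is percent-encoded as %2F when a database name is placed in a URL path.
console.log(encodeURIComponent("c/sharpcouch")); // c%2Fsharpcouch
console.log(decodeURIComponent("c%2Fmark_erlang")); // c/mark_erlang
```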

Tracing this further I realised that the call to ‘http://localhost:5984/_all_dbs’ was actually the one returning the wrong names, and calling the underlying function directly from ‘erl’ gave the same result:

> couch_server:all_databases().
 
{ok,["c/mark_erlang","c/sharpcouch","c/test_suite_db"]}

I don’t know Erlang well enough to try and change the code to fix this problem but I came across a bug report on the CouchDB website which described exactly the problem I’ve been having.

Apparently there is a problem when you use an upper case ‘C’ for the ‘DbRootDir’ property in ‘couch.ini’. Changing that to a lower case ‘c’, so that my ‘couch.ini’ file now looks like this, solved the problem:

DbRootDir=c:/couchdb/db

Written by Mark Needham

May 31st, 2009 at 11:28 pm

Posted in CouchDB

SharpCouch: Use anonymous type to create JSON objects

without comments

I’ve been playing around with CouchDB a bit today and in particular making use of SharpCouch, a library which acts as a wrapper around CouchDB calls. It is included in the CouchBrowse library which is recommended as a good starting point for interacting with CouchDB from C# code.

I decided to work out how the API worked by writing an integration test to save a document to the database.

The API is reasonably easy to understand and I ended up with the following test:

[Test]
public void ShouldAllowMeToSaveADocument()
{
    var server = "http://localhost:5984";
    var databaseName = "sharpcouch";
    var sharpCouchDb = new SharpCouch.DB();
 
    sharpCouchDb.CreateDocument(server, databaseName, "{ key : \"value\"}");
}

In theory that should save the JSON object { key : "value" } to the database, but it actually throws a 500 internal server error in SharpCouch.cs:

// SharpCouch.cs, line 275
HttpWebResponse resp = req.GetResponse() as HttpWebResponse;

Debugging into that line, the Status property was set to ‘Protocol Error’, and a bit of Googling led me to think that I probably had a malformed client request.
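One plausible culprit, though nothing here confirms it, is that the hand-written string is not strictly valid JSON because the key is unquoted. A strict parser shows the difference; `isValidJson` is a made-up helper for this sketch.

```javascript
// Report whether a string parses as strict JSON.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false;
  }
}

// The hand-written body from the test above: the key has no quotes.
console.log(isValidJson('{ key : "value"}')); // false

// What a JSON serializer produces for the same data.
console.log(JSON.stringify({ key: "value" })); // {"key":"value"}
console.log(isValidJson(JSON.stringify({ key: "value" }))); // true
```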

I tried the same test again, but this time built the document from an anonymous type and converted it to a JSON string using the LitJSON library:

[Test]
public void ShouldAllowMeToSaveADocumentWithAnonymousType()
{
    var server = "http://localhost:5984";
    var databaseName = "sharpcouch";
    var sharpCouchDb = new SharpCouch.DB();
 
    var savedDocument = new { key = "value"};
    sharpCouchDb.CreateDocument(server, databaseName, JsonMapper.ToJson(savedDocument));
}

That works much better and does actually save the document to the database. I was able to verify this by adding a new method to SharpCouch.cs which creates a document and then returns the ‘documentID’, allowing me to reload it afterwards.

[Test]
public void ShouldAllowMeToSaveAndRetrieveADocument()
{
    var server = "http://localhost:5984";
    var databaseName = "sharpcouch";
    var sharpCouchDb = new SharpCouch.DB();
 
    var savedDocument = new {key = "value"};
    var documentId = sharpCouchDb.CreateDocumentAndReturnId(server, databaseName, JsonMapper.ToJson(savedDocument));
 
    var retrievedDocument = sharpCouchDb.GetDocument(server, databaseName, documentId);
 
    Assert.AreEqual(savedDocument.key, JsonMapper.ToObject(retrievedDocument)["key"].ToString());
}

// Added to SharpCouch.cs
public string CreateDocumentAndReturnId(string server, string db, string content)
{
    var response = DoRequest(server + "/" + db, "POST", content, "application/json");
    return JsonMapper.ToObject(response)["id"].ToString();
}

I’m not sure how well anonymous types work for more complicated JSON objects but for the simple cases they seem to do the job.

Written by Mark Needham

May 31st, 2009 at 8:59 pm

Posted in .NET,CouchDB
