Mark Needham

Thoughts on Software Development

Archive for the ‘twitter’ tag

Neo4j: Graphing the ‘My name is…I work’ Twitter meme


Over the last few days I’ve been watching the chain of ‘My name is…’ tweets kicked off by DHH with interest. As I understand it, the idea is to show that coding interview riddles/hard tasks on a whiteboard are ridiculous.

Other people quoted that tweet and added their own piece and yesterday Eduardo Hernacki suggested that traversing this chain of tweets seemed tailor made for Neo4j.

Michael was quickly on the scene and created a Cypher query which calls the Twitter API and creates a Neo4j graph from the resulting JSON response. The only tricky bit is creating a ‘bearer token’ but Jason Kotchoff has a helpful gist showing how to generate one from your Twitter consumer key and consumer secret.
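If you'd rather see the token request spelled out, here is a minimal Python sketch. It assumes Twitter's application-only auth flow: a POST to the oauth2/token endpoint with HTTP Basic credentials built from the consumer key and secret. The function names are my own, not taken from Jason's gist, and the endpoint may have changed since this was written.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumption: the application-only auth endpoint of the Twitter API.
TOKEN_URL = "https://api.twitter.com/oauth2/token"


def basic_auth_header(consumer_key, consumer_secret):
    """Build the HTTP Basic header value from the URL-encoded key and secret."""
    key = urllib.parse.quote(consumer_key, safe="")
    secret = urllib.parse.quote(consumer_secret, safe="")
    credentials = f"{key}:{secret}".encode("ascii")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def build_token_request(consumer_key, consumer_secret):
    """Prepare (but do not send) the POST request that asks for a bearer token."""
    return urllib.request.Request(
        TOKEN_URL,
        data=b"grant_type=client_credentials",
        headers={
            "Authorization": basic_auth_header(consumer_key, consumer_secret),
            "Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
        },
    )


def fetch_bearer_token(consumer_key, consumer_secret):
    """Send the request and return the access_token field of the JSON reply."""
    request = build_token_request(consumer_key, consumer_secret)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```

Calling fetch_bearer_token with your own consumer key and secret should return the string that goes into the bearer parameter.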

Now that we’ve got our bearer token let’s create a parameter to store it. Type the following in the Neo4j browser:

:param bearer: '<your-bearer-token-goes-here>'

Now we’re ready to query the Twitter API. We’ll start with the search API and find all tweets which contain the phrases “my name” and “I work”. That will return a JSON response containing lots of tweets. We’ll then create a node for each tweet it returns, a node for the user who posted the tweet, a node for the tweet it quotes, and relationships to glue them all together.
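The percent-encoded search term is easy to get wrong by hand, so here is a small Python sketch that builds the search URL from the two phrases. The helper name build_search_url is hypothetical; the endpoint and query parameters are the ones used in this post.

```python
from urllib.parse import quote, urlencode

SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(phrases, count=100):
    """Return the search URL for tweets containing every quoted phrase."""
    # Wrap each phrase in double quotes so it is searched as an exact phrase.
    query = " ".join(f'"{phrase}"' for phrase in phrases)
    params = {"count": count, "result_type": "recent", "lang": "en", "q": query}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return SEARCH_URL + "?" + urlencode(params, quote_via=quote)
```

For the two phrases in this post, build_search_url(["my name", "I work"]) ends in q=%22my%20name%22%20%22I%20work%22.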

We’re going to use the apoc.load.jsonParams procedure from the APOC library to help us import the data. If you want to follow along you can use a Neo4j sandbox instance which comes with APOC installed. For your local Neo4j installation, grab the APOC jar and put it into your plugins folder before restarting Neo4j.

This is the query in full:

WITH 'https://api.twitter.com/1.1/search/tweets.json?count=100&result_type=recent&lang=en&q=' as url, {bearer} as bearer
 
CALL apoc.load.jsonParams(url + "%22my%20name%22%20%22I%20work%22",{Authorization:"Bearer "+bearer},null) yield value
 
UNWIND value.statuses as status
WITH status, status.user as u, status.entities as e
WHERE status.quoted_status_id is not null
 
// create a node for the original tweet
MERGE (t:Tweet {id:status.id}) 
ON CREATE SET t.text=status.text,t.created_at=status.created_at,t.retweet_count=status.retweet_count, t.favorite_count=status.favorite_count
 
// create a node for the author + a POSTED relationship from the author to the tweet
MERGE (p:User {name:u.screen_name})
MERGE (p)-[:POSTED]->(t)
 
// create a MENTIONED relationship from the tweet to any users mentioned in the tweet
FOREACH (m IN e.user_mentions | MERGE (mu:User {name:m.screen_name}) MERGE (t)-[:MENTIONED]->(mu))
 
// create a node for the quoted tweet and create a QUOTED relationship from the original tweet to the quoted one
MERGE (q:Tweet {id:status.quoted_status_id})
MERGE (t)-[:QUOTED]->(q)
 
// repeat the above steps for the quoted tweet
WITH t as t0, status.quoted_status as status WHERE status is not null
WITH t0, status, status.user as u, status.entities as e
 
MERGE (t:Tweet {id:status.id}) 
ON CREATE SET t.text=status.text,t.created_at=status.created_at,t.retweet_count=status.retweet_count, t.favorite_count=status.favorite_count
 
MERGE (t0)-[:QUOTED]->(t)
 
MERGE (p:User {name:u.screen_name})
MERGE (p)-[:POSTED]->(t)
 
FOREACH (m IN e.user_mentions | MERGE (mu:User {name:m.screen_name}) MERGE (t)-[:MENTIONED]->(mu))
 
MERGE (q:Tweet {id:status.quoted_status_id})
MERGE (t)-[:QUOTED]->(q);
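To make the shape of the import easier to see, here is a rough Python sketch of the same mapping applied to already-parsed statuses. It only relies on the JSON fields the query reads (id, text, user.screen_name, entities.user_mentions, quoted_status_id, quoted_status); the function name and the returned dictionary layout are my own, and dictionaries and sets stand in for MERGE. Unlike the query, it simply skips a missing quoted_status_id rather than trying to merge on it.

```python
def extract_graph(statuses):
    """Collect tweets, users and relationships for statuses that quote a tweet."""
    tweets, users = {}, set()
    posted, quoted, mentioned = set(), set(), set()

    def add_tweet(status):
        # Like MERGE ... ON CREATE SET: the text is stored only if the id is new.
        tweets.setdefault(status["id"], status.get("text"))
        author = status["user"]["screen_name"]
        users.add(author)
        posted.add((author, status["id"]))
        for mention in status.get("entities", {}).get("user_mentions", []):
            users.add(mention["screen_name"])
            mentioned.add((status["id"], mention["screen_name"]))
        quoted_id = status.get("quoted_status_id")
        if quoted_id is not None:
            # Placeholder entry for the quoted tweet, with no text yet.
            tweets.setdefault(quoted_id, None)
            quoted.add((status["id"], quoted_id))

    for status in statuses:
        if status.get("quoted_status_id") is None:
            continue  # same filter as the WHERE clause in the query
        add_tweet(status)
        if status.get("quoted_status") is not None:
            add_tweet(status["quoted_status"])

    return {"tweets": tweets, "users": users, "posted": posted,
            "quoted": quoted, "mentioned": mentioned}
```

One thing the sketch makes visible: a quoted tweet is first stored as a bare id, so the later ON CREATE step does not fill in its text. As far as I can tell the Cypher query behaves the same way, which would explain nodes that are missing some properties after the import.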

The resulting graph looks like this:

MATCH p=()-[r:QUOTED]->() RETURN p LIMIT 25

[Image: graph of tweets connected by QUOTED relationships]

A more interesting query would be to find the path from DHH to Eduardo which we can find with the following query:

MATCH path = (dhh:Tweet {id: 834146806594433025})<-[:QUOTED*]-(eduardo:Tweet {id: 836400531983724545})
UNWIND NODES(path) AS tweet
MATCH (tweet)<-[:POSTED]-(user)
RETURN tweet, user

This query:

  • starts from DHH’s tweet
  • traverses all QUOTED relationships until it finds Eduardo’s tweet
  • collects all those tweets and then finds the author
  • returns the tweet and the author
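As a way of picturing what the variable-length QUOTED pattern does, here is a small Python sketch that walks the same chain over a plain dictionary. It assumes each tweet quotes at most one other tweet, so the dictionary maps a quoting tweet id to the id it quotes; the function name and the toy ids in the usage note are made up for illustration.

```python
def quote_chain(quotes, start_id, end_id):
    """Follow QUOTED edges from start_id until end_id is reached.

    Returns the list of tweet ids visited (at least one hop), or None if
    end_id cannot be reached from start_id.
    """
    chain = [start_id]
    seen = {start_id}
    current = start_id
    while True:
        current = quotes.get(current)
        if current is None or current in seen:
            return None  # dead end, or a cycle that never reaches end_id
        chain.append(current)
        seen.add(current)
        if current == end_id:
            return chain
```

With quotes = {3: 2, 2: 1}, quote_chain(quotes, 3, 1) walks from tweet 3 back to tweet 1 through tweet 2, which is the role Eduardo's and DHH's tweets play in the Cypher query.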

And this is the output:

[Image: graph of the tweets and their authors on the path from DHH’s tweet to Eduardo’s]

I ran a couple of other queries against the Twitter API to hydrate some nodes that we hadn’t set all the properties on – you can see all the queries on this gist.

For the next couple of days I also have a sandbox running at https://10-0-1-157-32898.neo4jsandbox.com/browser/. You can log in using the credentials readonly/twitter.

If you have any questions/suggestions let me know in the comments, on Twitter (@markhneedham), or by emailing the Neo4j DevRel team at devrel@neo4j.com.

Written by Mark Needham

February 28th, 2017 at 3:50 pm

Posted in neo4j


Twitter as a learning tool


About 8 or 9 months ago I remember having a conversation with a colleague where I asked him where he had got his almost encyclopedic knowledge of all things software development.

His reply at the time was that he read a lot of blogs and that this was where he had picked up a lot of the information.

While subscribing to different blogs remains a useful way of learning about different aspects of software development, I think Twitter is now becoming a very useful complementary tool to use alongside the RSS reader.

I originally thought of Twitter merely as an extension of the Facebook status, but several bloggers, most notably Roy Osherove, have been using Twitter almost as a cutting-room floor for blog articles or upcoming books.

Several blog aggregations have also started posting updates onto Twitter including Los Techies, Elegant Code, Code Better and Planet TW, which I set up at the end of last week. While this doesn’t remove the need for subscribing to the feed it provides a constant stream of blogs to read rather than the batch reading process I tend to use when reading posts from Google Reader.

Following software development authors is another way to keep in touch with what the best in the field are working on. Jurgen Appelo has removed the need to find them all by creating a top 50 list of software development Tweeters. This use of Twitter is also encouraged in the upcoming book Apprenticeship Patterns.

Since I started using Twitter I have come across quite a lot of interesting content that I may not have come across otherwise:

  • Frequent conversations about REST and DDD between serialseb and colinjack. I’ve not used REST as much as these guys have but it’s interesting to follow the conversation and it provides a potential future reference if and when I do work in this area.
  • Jeremy Miller explaining the ‘one in one out’ part of his Thunderdome approach to using ASP.NET MVC, which I did not quite understand from reading his blog.
  • A link to a post about natural talent by Guy Kawasaki – I find theories of learning intriguing so it was interesting to read an angle on the subject which talked about how being good at something might actually work against you.
  • Learning about Malcolm Gladwell’s new book Outliers from following Steven ‘Doc’ List and Jason Yip’s Twitter feeds. I probably would have come across this eventually but the process was shortened thanks to Twitter.

There has been some discussion on the Alt.NET mailing list with regards to whether Twitter is killing its use. I haven’t followed it for long enough to say whether that’s the case but one of the more valid cases against Twitter is that it is hard to follow message trails after the event – you tend to need to be there at the time the discussion is happening to get the most value from it.

Finding the right signal-to-noise ratio is also something that has to be managed, but overall I think Twitter is a useful platform for learning and getting to know what is happening in the rest of the software development universe.

Written by Mark Needham

December 7th, 2008 at 10:30 pm