Mark Needham

Thoughts on Software Development

Prim’s Algorithm in Ruby

with 4 comments

One of the first programming assignments of the Algorithms 2 course was to code Prim’s algorithm – a greedy algorithm used to find the minimum spanning tree of a connected weighted undirected graph.

In simpler terms we need to find the path of least cost which connects all of the nodes together and there can’t be any cycles in that path.

Wikipedia has a neat diagram which shows this more clearly:

The pseudocode for the algorithm is as follows:

  • Let X = nodes covered so far, T = edges covered so far, V = all the nodes in the graph
  • Pick an initial arbitrary node s – it doesn’t matter which one it is
  • while XV:
    • let e = (u,v) be the cheapest edge of the graph where uX and v ∉ X
      i.e. u is a node that has been covered and v a node that has not yet been covered
    • Add e to T
    • Add v to X

At the end of the algorithm we’ll have a collection of all the nodes in the graph and a collection of the edges we need to follow in order to create a minimal spanning tree. If we sum the weights of those edges we get the cost of the tree.

I used an adjacency matrix to represent the graph i.e. a 2 dimensional array of size n*n (where n = number of nodes in the graph).

If node 0 had an edge to node 3 of weight 4 then we’d put this entry into the matrix:

adjacency_matrix[0][3] = 4

I tried to get my implementation of the algorithm to look as close to the pseudocode as possible and I ended up with the following:

adjacency_matrix = create_adjacency_matrix
first_edge = select_first_edge(adjacency_matrix)
@nodes_spanned_so_far, @edges = [first_edge[:start], first_edge[:end]], [first_edge]
while !nodes_left_to_cover.empty?
  cheapest_edge = find_cheapest_edge(adjacency_matrix, @nodes_spanned_so_far, number_of_nodes)
  @edges << cheapest_edge
  @nodes_spanned_so_far << cheapest_edge[:start]

The code became a bit messy in parts because it relies on a 0 indexed array yet the names of the nodes start at 1. There’s therefore loads of +1s and -1s dotted around the place.

The method to work out the next cheapest node looks like this:

def find_cheapest_edge(adjacency_matrix, nodes_spanned_so_far, number_of_nodes)
  available_nodes = (0..number_of_nodes-1).to_a.reject { |node_index| nodes_spanned_so_far.include?(node_index + 1) }
  cheapest_edges = available_nodes.inject([]) do |acc, node_index|
    get_edges(adjacency_matrix, node_index).select { |_, other_node_index| nodes_spanned_so_far.include?(other_node_index + 1) }.each do |weight, other_node_index|
      acc << { :start => node_index + 1, :end => other_node_index + 1, :weight => weight }
  cheapest_edges.sort { |x,y| x[:weight]  y[:weight] }.first
def get_edges(adjacency_matrix, node_index)
  adjacency_matrix[node_index].each_with_index.reject { |edge, index| edge.nil? }

We first get all the nodes which haven’t already been spanned and then build up a collection of the edges between nodes we’ve already spanned and ones that we haven’t.

The other bit of interesting code is the creation of the adjacency matrix at the beginning:

def create_adjacency_matrix
  adjacency_matrix = [].tap { |m| number_of_nodes.times { m << } }
  file.drop(1).map { |x| x.gsub(/\n/, "").split(" ").map(&:to_i) }.each do |(node1, node2, weight)|
    adjacency_matrix[node1 - 1][node2 - 1] = weight
    adjacency_matrix[node2 - 1][node1 - 1] = weight

Here we are first parsing the file which involves skipping the first header line and the converting it into a collection of integer arrays representing the two nodes and their corresponding edge weight.

We then put two entries into the adjacency matrix, one entry from node A to node B and one entry from node B to node A. The reason we do this is that this is an undirected graph so we can go either way between the nodes.

To work out the cost of the minimum spanning tree I use this line of code at the end:

puts "edges: #{@edges}, total spanning tree cost #{@edges.inject(0) {|acc, edge| acc + edge[:weight]}}"

My full solution is on github so if anyone has any suggestions/improvements they’re always welcome.

Be Sociable, Share!

Written by Mark Needham

December 15th, 2012 at 2:51 am

Posted in Algorithms

Tagged with

  • Pingback: Prim’s algorithm using a heap/priority queue in Ruby at Mark Needham()

  • I’m slightly surprised you can post your code. One of my frustrations with the recent Coursera Functional Programming In Scala course was that solutions aren’t available, so I could write my stuff, get whatever mark, but then not see how other people had solved the problem, or even what a model answer from the instructors looked like. Some people did post code in public repositories on github and apparently were ejected from the course.

  • @twitter-8145762:disqus yeh I don’t think you’re supposed to post the solutions but then again this is a very common algorithm and you can find loads of solutions through Googling so I figure it’s not a big deal? 

    Having said that I did get told off for posting my solution to a problem in Algorithms 1 but I don’t really buy the arguments to be honest. If you’re doing the problems to learn then it’s pretty pointless copying what someone else has done and I find that I learn way more from talking about code with other people and seeing what they did differently and so on…

    So I’ll probably get ejected now 😛 

  • Pingback: Kruskal’s Algorithm in Ruby at Mark Needham()