31 Dec 2012

Bitwise operations in Ruby and Haskell

Part of one of the most recent problems in the Algorithms 2 course required us to find the 'neighbours' of binary values.

In this case a neighbour is described as being any other binary value which has an equivalent value or differs in 1 or 2 bits.

e.g. the neighbours of '10000' would be '00000', '00001', '00010', '00100', ''01000', '10001', '10010', '10011', '10100', '10101', '10110', '11000', '11001', '11010' and '11100'_~

I initially treated '10000' as an array of 1s and 0s and wrote a function to recursively come up with the above combinations before it was pointed out to me that it’d be much easier to use bit wise logic instead.

I don’t remember every having to write any code using bit wise operations since university so I thought I’d document what I learnt.

The first thing to do is convert those binary values into a decimal equivalent which is pretty easy in Ruby:

> "10000".to_i(2) \=> 16

In Haskell I stole a function from PLEAC to do the job:

> import Data.Char
> (foldr (\c s \-> s * 2 + c) 0 . reverse . map digitToInt) "10000" 16

I initially worked out the neighbours by hand but I need to work out how to convert that into code so I ran through an example.

We want to get from '10000' (16) to '00000' (0) which we can do by using an XOR operation:

Binary XOR Operator copies the bit if it is set in one operand but not both.

 '10000' XOR '10000'

In Ruby it would read like this:

> 16 {caret} 16 \=> 0

If we do the same XOR operation but changing the other bits instead we end up with all the neighbours of distance 1:

> [0,1,2,4,16].map { |x| 16 {caret} x } \=> [16, 17, 18, 20, 0]

We can generalise the function that creates that array of values like so:

> bits = 5 > offsets = (0..(bits - 1)).map { |x| 2 ** x } \=> [1, 2, 4, 8, 16]

or if we want to use bit shifting:

> offsets = (0..(bits - 1)).map { |x| 1 << x } \=> [1, 2, 4, 8, 16]

With that approach we’re moving the "left operands value is moved left by the number of bits specified by the right operand." i.e. we move '1' left by 0 bits (from '1’to '1'), then by 1 bit (from '1' to '10'), then by 2 bits (from '1' to '100') and so on.

We still need to get all the neighbours which differ by 2 bits which we can do by getting all the ways that we can combine those offsets together. The combination function and Bitwise or do the trick here:

> offsets.combination(2).to_a \=> [[1, 2], [1, 4], [1, 8], [1, 16], [2, 4], [2, 8], [2, 16], [4, 8], [4, 16], [8, 16]]
> offsets.combination(2).to_a.map { |a,b| a|b } \=> [3, 5, 9, 17, 6, 10, 18, 12, 20, 24]

Now if we put it all together:

> initial_value = "10000" \=> "10000"
> bits = initial_value.length \=> 5
> value = "10000".to_i(2) \=> 16
> single_bit_offsets = (0..(bits - 1)).map { |x| 1 << x } \=> [1, 2, 4, 8, 16]
> all_the_offsets = offsets + offsets.combination(2).to_a.map { |a,b| a|b } \=> [1, 2, 4, 8, 16, 3, 5, 9, 17, 6, 10, 18, 12, 20, 24]
 > all_the_offsets.map { |offset| value {caret} offset }.sort \=> [0, 1, 2, 4, 8, 17, 18, 19, 20, 21, 22, 24, 25, 26, 28]

The final Haskell version looks like this:

> import Data.Char
> import Data.Bits
> import Data.List
> let initialValue = "10000"
> let bits = length initialValue
> let value = (foldr (\c s \-> s * 2 + c) 0 . reverse . map digitToInt) initialValue
> let singleBitOffsets = map (shiftL 1) [0..(bits - 1)] :: [Int] > let allTheOffsets = singleBitOffsets ++ map (\(x:y:_) \-> (x .|. y)) (combinationsOf 2 singleBitOffsets) :: [Int]
> Data.List.sort $ map (xor value) allTheOffsets [0,1,2,4,8,17,18,19,20,21,22,24,25,26,28]

I took the combinationsOf function from David Amos' Combinatorics Generation library and it reads like so: _~haskell — subsets of size k combinationsOf 0 _ = [[]] combinationsOf _ [] = [] combinationsOf k (x:xs) = map (x:) (combinationsOf (k-1) xs) ++ combinationsOf k xs _~

A real life version of this problem occurs in the world of genome analysis where scientists need to work out whether phylogenetic profiles are functionally linked.

About the author

I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.