Mark Needham

Thoughts on Software Development

Sed: Replacing characters with a new line

with 5 comments

I’ve been playing around with writing some algorithms in both Ruby and Haskell and the latter wasn’t giving the correct result so I wanted to output an intermediate state of the two programs and compare them.

I didn’t do any fancy formatting of the output from either program so I had the raw data structures in text files which I needed to transform so that they were comparable.

The main thing I wanted to do was get each of the elements of the collection onto their own line. The output of one of the programs looked like this:

[(1,2), (3,4)…]

To get each of the elements onto a new line my first step was to replace every occurrence of ', (' with '\n('. I initially tried using sed to do that:

sed -E -e 's/, \(/\\n(/g' ruby_union.txt

All that did was insert the string value ‘\n’ rather than the new line character.

I’ve come across similar problems before and I usually just use tr but in this case it doesn’t work very well because we’re replacing more than just a single character.
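To make the limitation concrete, here is a small sketch using a shortened version of the sample input. tr maps single characters to single characters, so the closest it can get is turning every comma into a new line, which also splits the pairs themselves. The variable names are only illustrative.

```shell
# tr translates one character into another, so it cannot target the
# separator ', (' as a unit. Replacing every comma breaks up the pairs too.
input='[(1,2), (3,4)]'
tr_output=$(printf '%s\n' "$input" | tr ',' '\n')
printf '%s\n' "$tr_output"
# prints:
# [(1
# 2)
#  (3
# 4)]
```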

I came across this thread on Linux Questions which gives a couple of ways that we can get sed to do what we want.

The first suggestion is that we should type a backslash followed by the enter key at the point in our sed expression where we want the new line to be and then continue writing the rest of the expression.

We therefore end up with the following:

sed -E -e 's/, \(/\
(/g' ruby_union.txt

This approach works but it’s a bit annoying: when you edit the command at the prompt you need to delete the rest of the expression first so that pressing enter puts the line break in the right place.
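Quoting matters with this form. Inside double quotes the shell treats a backslash followed by a line break as a line continuation and removes both before sed ever sees them, so single quotes are the safer choice. A minimal sketch of the idea, wrapped in a helper whose name (split_elements) is made up for this example:

```shell
# Hypothetical helper: put each ', (' separated element on its own line.
# The replacement is a backslash followed by a real line break. Single
# quotes stop the shell from touching the backslash, and the second line
# of the expression must start in the first column so that no extra
# spaces end up in the replacement.
split_elements() {
  sed -E -e 's/, \(/\
(/g'
}

printf '%s\n' '[(1,2), (3,4), (5,6)]' | split_elements
# prints:
# [(1,2)
# (3,4)
# (5,6)]
```

Putting the expression in a function or script file also sidesteps the editing annoyance, since it only has to be typed once.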

An alternative is to make use of echo with the ‘-e’ flag which allows us to output a new line. Usually backslashed characters aren’t interpreted and so you end up with a literal representation. e.g.

$ echo "mark\r\nneedham"
mark\r\nneedham
 
$ echo -e "mark\r\nneedham"
mark
needham

We therefore end up with this:

sed -E -e "s/, \(/\\`echo -e '\n\r'`/g" ruby_union.txt
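The '\r' in that command is doing real work: command substitution strips trailing new lines, so a lone '\n' would vanish, and the carriage return protects it. The price is a stray carriage return at the start of each following line. One possible variation, assuming a POSIX shell, builds the new line with printf (so nothing depends on how echo handles '-e') and protects it with a throwaway character that is trimmed off again. The names nl and result are just for illustration, and the second expression is the bracket clean-up shown in the update further down.

```shell
# Command substitution drops trailing new lines, so append a guard
# character and trim it off again to keep a lone new line in nl.
nl=$(printf '\nx')
nl=${nl%x}

# Inside double quotes "\\" becomes a single backslash, so sed sees a
# backslash followed by a real line break in the replacement.
result=$(printf '%s\n' '[(1,2), (3,4), (5,6)]' |
  sed -E -e "s/, \(/\\${nl}/g" -e 's/\[|]|\)|\(//g')
printf '%s\n' "$result"
# prints:
# 1,2
# 3,4
# 5,6
```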

** Update **

It was pointed out in the comments that this final version of the sed statement doesn't produce very clean output. That's because I left out the other expressions I passed to it, which get rid of the extra brackets.

The following gives a cleaner output:

$ echo "[(1,2), (3,4), (5,6)]" | sed -E -e "s/, \(/\\`echo -e '\n\r'`/g" -e 's/\[|]|\)|\(//g'
1,2
3,4
5,6

Written by Mark Needham

December 29th, 2012 at 5:49 pm

Posted in Shell Scripting

  • http://franklinchen.com/ Franklin Chen

    I’m confused. What’s the intended output? I copied and pasted the final sed line and ran and got stuff looking like

    [(1,2)

    3,4)

    5,6)]

  • http://www.markhneedham.com/blog Mark Needham

    @franklinchen that was just the first part of the command. I deleted the rest because I only wanted to record how you get the new line bit to work.

    The whole thing was:

    $ echo "[(1,2), (3,4), (5,6)]" | sed -E -e "s/, \(/\\`echo -e '\n\r'`/g" -e 's/\[|]|\)|\(//g'

    That gives you:

    1,2
    3,4
    5,6

    I then compared the two files with vimdiff and it showed any differences between the two programs.

  • Pingback: Geek Reading December 31, 2012 | Regular Geek

  • phoenix_fr

    Hi, i don’t get it.
    Your 1st try with sed was almost good. You just needed to put \n instead of \\n (or \n\r if you need that).

    So the entire sed command should look like:
    echo "[(1,2), (3,4), (5,6)]" | sed -E -e 's/, \(/\n/g' -e 's/\[|]|\)|\(//g'
    or
    echo "[(1,2), (3,4), (5,6)]" | sed -E -e 's/, \(/\n\r/g' -e 's/\[|]|\)|\(//g'

    Works well on my Debian GNU/Linux 6.0.6 (squeeze).
    Am i missing something ?

  • http://www.markhneedham.com/blog Mark Needham

    @phoenix_fr I tried your version on Mac OS X and this is what I get:

    $ echo "[(1,2), (3,4), (5,6)]" | sed -E -e 's/, \(/\n\r/g' -e 's/\[|]|\)|\(//g'
    1,2nr3,4nr5,6

    It doesn’t seem to treat them as a new line and carriage return but as the literal characters instead.