Mark Needham

Thoughts on Software Development

Sed across multiple files

Pankhuri and I needed to rename a method and change every place where it was called, so we decided to see if we could work out how to do it using sed.

We needed to change a method call roughly like this:

home_link(current_user)

To instead read:

homepage_path

For which we need the following sed expression:

sed -i 's/home_link([^)]*)/homepage_path/' [file_name]
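To see what the expression does before pointing it at real files, it can be run against a single line on standard input. This is only a sketch: the sample line is made up for illustration and is not from our code base. Because sed uses basic regular expressions by default, the unescaped parentheses match literally, and `[^)]*` matches whatever sits between them.

```shell
# Hypothetical sample line, invented for this illustration.
line='<a href="<%= home_link(current_user) %>">Home</a>'

# Same expression as above, but reading from stdin instead of editing a file.
result=$(printf '%s\n' "$line" | sed 's/home_link([^)]*)/homepage_path/')

printf '%s\n' "$result"
```

Note that without a trailing `g` flag the expression only replaces the first match on each line, which was enough for our case.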

Which works pretty well if you know which file you want to change but we wanted to run it over the whole code base.

A bit of googling led us to this thread on devshed which suggested we’d need to get a list of the files and then run sed through the list:

for file in `find .  -type f`; do sed -i 's/home_link([^)]*)/homepage_path/' $file; done

That pretty much works, but it doesn’t play nicely if a file has a space in its name: the shell splits the output of find on whitespace, so sed is handed the file name in pieces.
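The splitting can be reproduced without sed at all. The following is a small sketch that assumes a default `IFS` and uses a throwaway directory and a made-up file name; it counts how many items the loop sees for a single file.

```shell
# Throwaway directory containing one file with a space in its name
# (the name is hypothetical).
tmp=$(mktemp -d)
touch "$tmp/my file.rb"

count=0
pieces=""
# The unquoted command substitution is split on whitespace by the shell,
# so one path arrives as two separate loop items.
for file in $(cd "$tmp" && find . -type f); do
  count=$((count + 1))
  pieces="$pieces[$file]"
done

printf 'items seen: %d %s\n' "$count" "$pieces"
rm -rf "$tmp"
```

The loop body runs twice for the one file, which is why sed ends up being asked to edit files that don't exist.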

I was pretty sure that we should be able to pipe the output of the find into xargs and a bit more googling led us to the following solution:

find . -type f -print0 | xargs -0 sed -i 's/home_link([^)]*)/homepage_path/'

The ‘-print0’ flag is described like so:

This primary always evaluates to true.  It prints the pathname of the current file to standard output, followed by an ASCII NUL character (character code 0).

While ‘-0’ in ‘xargs’ is described like this:

  -0      Change xargs to expect NUL (``\0'') characters as separators, instead of spaces and newlines.  This is expected to be used in concert with the -print0 function in find(1).
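A quick way to convince yourself that the NUL-separated pipeline copes with spaces is to try it in a scratch directory first. This sketch makes a few assumptions that are not in the original command: the directory and file name are invented, and it uses `-i.bak` rather than plain `-i`, since the suffixed form is accepted by both GNU and BSD sed and leaves a backup behind.

```shell
# Scratch directory with one hypothetical file whose name contains a space.
tmp=$(mktemp -d)
printf 'home_link(current_user)\n' > "$tmp/my view.erb"

# find emits NUL-terminated paths; xargs -0 splits on NUL only,
# so the space in the file name survives intact.
find "$tmp" -type f -print0 | xargs -0 sed -i.bak 's/home_link([^)]*)/homepage_path/'

result=$(cat "$tmp/my view.erb")
printf '%s\n' "$result"
rm -rf "$tmp"
```

On a real code base it is probably worth narrowing the find with something like `-name '*.rb'` so that files under version control metadata directories are left alone.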

It also runs amazingly fast!

If anyone knows a better way feel free to point it out in the comments.

Written by Mark Needham

January 11th, 2011 at 4:43 pm

Posted in Shell Scripting

  • http://lixo.org Carlos Villela

    An alternative that is potentially less safe than the -print0 option, but easier to remember (at least to me), is to just double-quote the $file variable use. This way, bash will still expand it, but leave it as a single command-line argument to sed. Like this:

    for file in `find . -type f`; do sed -i 's/home_link([^)]*)/homepage_path/' "$file"; done

    SIMPLES!

  • Paul Symons

    Another way you can do it is by using bash’s while loop. So instead of

    find . -type f -print0 | xargs -0 sed -i 's/home_link([^)]*)/homepage_path/'

    you could also do

    find . -type f -print0 | while IFS= read -r -d '' fname ; do sed -i 's/home_link([^)]*)/homepage_path/' "$fname" ; done

  • http://www.markhneedham.com/blog Mark Needham

    @Carlos – I did try putting the quotes around the $file but I still see the same problem, not sure why :S

    @Paul – cool I didn’t know about the while loop, looks neat.

  • http://thinkaround.blogspot.com Saager Mhatre

    dude, what…? nooo!! why are people running loops over find! use the -exec option!
    http://en.wikipedia.org/wiki/Find#Execute_an_action

  • http://thinkaround.blogspot.com Saager Mhatre

    so the command would look like so

    find . -type f -exec sed -i -e 's/home_link([^)]*)/homepage_path/' {} \;

    just *one* command, muchacho!

  • Devdas Bhagat

    # IFS stands for Internal Field Separator. It defaults to whitespace.
    # We save the value of $IFS.
    export OLDIFS=$IFS

    # Set IFS to a newline.
    IFS='
    '

    find . -type f | xargs sed -i -e 's/foo/bar/g'

    Or you could just use Perl.

    IFS='
    '

    perl -pi -e 's/foo/bar/g' `find . -type f`

  • Devdas Bhagat

    Saager, -exec will execute sed once for every file you pass to it. xargs will pass multiple filenames, reducing the count of sed invocations.

  • http://thinkaround.blogspot.com Saager Mhatre

    @Devdas, it’s sed, with a single substitution, being run just once, do we really care how many invocations of it there are going to be? :)

    besides, i’ve used it with (much) larger expressions over quite a list of files to good effect in case you were worried about performance.

    the idea being that i essentially have only one command for finding and looping. there’s really no need to pipe together more.

    xargs has its use cases, but -exec and -execdir help me keep the command terse enough that i can whip it out in my sleep.

  • http://www.markhneedham.com Mark Needham

    @Devdas – I didn’t know about IFS, that looks pretty clever…and now I have to learn Perl as well?! :-P

    @Saager – I’ll have to give it a try while timing the replacement to see if it’s any different! But like you say I doubt it’s gonna be significant.

  • Devdas Bhagat

    -exec {} doesn’t hurt for a few dozen files. It kills you when you have a million or two.

    Learning Perl is always good. Especially once you cross the need to golf and start writing higher order Perl. Or OO Perl with Moose.

    Just avoid http://99-bottles-of-beer.net/language-sendmail-588.html and http://99-bottles-of-beer.net/language-perl-737.html as style guides.
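On the -exec versus xargs question raised in the comments: find also accepts `+` in place of `\;` as the terminator for -exec, which batches many paths into one invocation much as xargs does. The sketch below is not from the thread; it uses a scratch directory with made-up file names and a tiny `sh -c 'echo run'` stand-in for sed, so that each line of output corresponds to one invocation of the command.

```shell
# Scratch directory with three hypothetical files, one with a space in its name.
tmp=$(mktemp -d)
touch "$tmp/a.rb" "$tmp/b.rb" "$tmp/c d.rb"

# Terminated by \; the command is started once per file.
per_file=$(find "$tmp" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# Terminated by + the paths are appended to a single command line.
batched=$(find "$tmp" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

printf 'per-file invocations: %s, batched invocations: %s\n' "$per_file" "$batched"
rm -rf "$tmp"
```

With sed in place of the stand-in, that would look like `find . -type f -exec sed -i 's/home_link([^)]*)/homepage_path/' {} +`, which keeps the single-command form while avoiding one process per file.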
