Making Friends With Command Line Perl

From Devpit

Making friends with command-line tools usually pays off. Other documentation explains what the individual components do, but tying them together takes a little experience, so here are some quick examples to learn from. Typically you start with a simple task and quickly realize there's a special case to deal with, or that you need to apply it to every file in a directory. These examples start simple and grow, just like command lines in real life usually do.

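This is zsh syntax. In particular, the rename examples rely on zsh running the last command of a pipeline in the current shell, so a variable set by read at the end of a pipeline is still available to the commands that follow it.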

Beginner's Example

Switching leading spaces to tabs in a source file:

perl -pwe '0 while s/^(\t*) {8}/$1\t/gs' < /tmp/old > /tmp/new

Maybe you'd like to see the diff between the old and new immediately:

perl -pwe '0 while s/^(\t*) {8}/$1\t/gs' < /tmp/old | tee /tmp/new | diff /tmp/old -

What if you have 20 files in a directory to do this for?

mkdir /tmp/new
for file in *; do perl -pwe '0 while s/^(\t*) {8}/$1\t/gs' < ${file} > /tmp/new/${file}; done


Renaming Lots Of Files

Let's say you want to lowercase all the PNG file extensions because an annoying Windows machine goofed them up. Always do a test run first with the echo in place to be sure you haven't made a typo, and remove the echo only when you're ready to commit. It won't be long before you're glad you did.

unsetopt sh_word_split
for oldname in *.PNG; do (echo $oldname | perl -pwe 's/\.PNG$/.png/gs' | read newname && echo mv -iv $oldname $newname) || break; done

To lowercase all file extensions:

unsetopt sh_word_split
for oldname in *; do (echo $oldname | perl -pwe 's/\.([^\.]+)$/.\L$1/gs' | read newname && echo mv -iv $oldname $newname) || break; done

If $oldname == $newname, mv will ask you to overwrite (some implementations instead complain that the two names are the same file). If this troubles you, it's easy to work around. If you're confident and lazy, change -i to -f. Otherwise, test x$oldname != x$newname first. This starts to get a bit clumsy for beginners, though:

unsetopt sh_word_split
for oldname in *; do (echo $oldname | perl -pwe 's/\.([^\.]+)$/.\L$1/gs' | read newname && if test x$oldname != x$newname; then mv -iv $oldname $newname; fi) || break; done

CSV Files

Here's a quick way to deal with CSV files properly. Sure, splitting on commas roughly works, but only when you're sure there aren't any commas inside the fields.

Let's say you want to clear columns 9 and 10 when they duplicate columns 7 and 8 (counting columns from zero, to match the array indices in the code). This separates each line ($_) into an array of fields (@_), then combines it back. The BEGIN block isn't strictly necessary in this example (you could instead repeat the setup code on each iteration), but it demonstrates how to run setup code once before the first line is read. Note that you can replace the if() statement with whatever transformation you're interested in.

perl -pwe 'BEGIN {use Text::CSV_XS; $csv = Text::CSV_XS->new(); $csv->always_quote(1);} $csv->parse($_); @_ = $csv->fields(); if($_[7] eq $_[9] and $_[8] eq $_[10]) {$_[9] = ""; $_[10] = "";} $csv->combine(@_); $_ = $csv->string() . "\n"'
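
The same one-liner can be spread over several lines inside the single quotes, which makes it easier to swap in a different transformation. This is only a reformatted sketch of the command above; the function name csv_clear_duplicates is hypothetical, and it assumes the Text::CSV_XS module is installed from CPAN.

```shell
# Hypothetical wrapper; reads CSV on stdin and writes CSV on stdout.
csv_clear_duplicates() {
    perl -pwe '
        BEGIN {
            # One-time setup, run before the first line is read.
            use Text::CSV_XS;
            $csv = Text::CSV_XS->new();
            $csv->always_quote(1);
        }
        $csv->parse($_);                # split the current line into fields
        @_ = $csv->fields();
        # Replace this test with whatever transformation you need.
        if ($_[7] eq $_[9] and $_[8] eq $_[10]) {
            $_[9] = "";
            $_[10] = "";
        }
        $csv->combine(@_);              # join the fields back into one line
        $_ = $csv->string() . "\n";
    '
}
```

It would be used as a filter, for example csv_clear_duplicates < in.csv > out.csv, where both file names are placeholders. Because always_quote is on, every field in the output is quoted, including the cleared ones. Keep apostrophes out of the comments, since the whole program sits inside one single-quoted shell string.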

Notes

  • Always use single-quotes rather than double-quotes around the Perl source to avoid accidental variable interpolation by the shell.
    • To escape single-quotes, use '\'' to end the quoted string, add a single-quote, and reopen the quoted string.
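  • sh_word_split is evil. Zsh leaves it off by default in its native mode, but it is turned on under sh and ksh emulation, so consider unsetting it explicitly in .zshenv (the examples above do so with unsetopt). On the occasion you do need word splitting, you can use ${=string_to_split} or something like this: `echo $string_to_split`. (Note how the back-quotes do not quote the result from echo, causing word-splitting.)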