Silly AWK scripts

AWK is an old program, and a bit arcane as a programming language, but it's also very simple to use if you have the right mental model. gawk (GNU AWK) is also available, but I will be referring to the classic version mostly.

The command line options are specified in the awk reference, and that also includes the basic program structure and expressions.

I like to write one-offs to process regular text files. By 'regular text file' I mean a text file that follows rules, not just any plain old text file. If you need to get fancy when processing something, Python is a better language for quick and dirty programs; I've written about text processing in Python before.

I usually write awk as a one-off command line that I build up somewhere convenient, so I can touch it up and paste it into a terminal to run.

Building an AWK program bit by bit

The starting structure is usually something like this:

awk '{print}' file.txt

This will just print out each line in file.txt - that's what the '{print}' program is. It's time to start writing something up, and here are the basic rules.
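As a rough sketch of how those rules play out: an awk program is a list of pattern { action } pairs. A rule with no pattern runs its action on every line, and a rule with no action prints the lines that match. The sample file and its contents below are made up for illustration.

```shell
# Hypothetical sample input, made up for this illustration.
f=$(mktemp)
printf 'alpha one\nbeta two\ngamma three\n' > "$f"

# No pattern: the action runs for every line.
# NR is the current line number, $0 is the whole line.
numbered=$(awk '{print NR ": " $0}' "$f")

# Pattern with no action: matching lines are printed unchanged.
matched=$(awk '/^beta/' "$f")

# $1, $2, ... are the whitespace-separated fields of the current line.
second=$(awk '{print $2}' "$f")

printf '%s\n' "$numbered" "$matched" "$second"
rm -f "$f"
```

The three commands differ only in which half of the pattern { action } pair they leave out, which is most of the mental model you need.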

Often I will use awk to count specific lines or to extract some information from those lines.

I use the BEGIN pattern to initialize counters, match lines with regular expressions to increment them, and print the totals in the END pattern.

For example, this will count the top-level header lines in a markdown file:

awk 'BEGIN {h=0}; /^# (.*)$/ {h+=1}; END {print "headers: " h}' file.md

NOTE: on Windows, the command prompt also uses the caret (^) as an escape character, so you would have to double it like this:

awk 'BEGIN {h=0}; /^^# (.*)$/ {h+=1}; END {print "headers: " h}' file.md

To print them out:

awk 'BEGIN {h=0}; /^# (.*)$/ {print;h+=1}; END {print "headers: " h}' file.md
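To see what this produces, here is a minimal run against a made-up markdown file (the contents are invented for illustration). It also hints at why the BEGIN initialization is worth keeping: an awk variable that was never assigned prints as an empty string, so without h=0 a file with no headers reports "headers: " with nothing after it.

```shell
# Hypothetical markdown input: two top-level headers, one second-level header.
md=$(mktemp)
printf '# Title\n\nSome text.\n\n## Subsection\n\n# Another\n' > "$md"

# Print each top-level header and the final count.
with_print=$(awk 'BEGIN {h=0}; /^# (.*)$/ {print;h+=1}; END {print "headers: " h}' "$md")
printf '%s\n' "$with_print"

# Hypothetical input with no headers at all.
plain=$(mktemp)
printf 'just text\n' > "$plain"

# With BEGIN, h is the number 0; without it, h is unset and prints as "".
initialized=$(awk 'BEGIN {h=0}; /^# / {h+=1}; END {print "headers: " h}' "$plain")
uninitialized=$(awk '/^# / {h+=1}; END {print "headers: " h}' "$plain")
printf '[%s]\n[%s]\n' "$initialized" "$uninitialized"

rm -f "$md" "$plain"
```

Note that "## Subsection" is not counted, because the pattern requires a space right after the first #.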

Now, if this is all you wanted to do, grep and wc have you covered. awk shines when you need to keep state as you go through your file.
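For comparison, here is a sketch of the same stateless count done without awk. grep -c counts matching lines on its own, so wc only comes in if you pipe plain grep output into wc -l. The sample file is again made up.

```shell
# Hypothetical markdown input, made up for this illustration.
md=$(mktemp)
printf '# Title\n\ntext\n\n## Sub\n\n# Another\n' > "$md"

# grep -c prints the number of matching lines.
grep_count=$(grep -c '^# ' "$md")

# Same count via wc; tr strips the padding some wc implementations add.
wc_count=$(grep '^# ' "$md" | wc -l | tr -d ' ')

# The awk equivalent; h+0 forces a numeric 0 when nothing matched.
awk_count=$(awk '/^# / {h+=1}; END {print h+0}' "$md")

echo "$grep_count $wc_count $awk_count"
rm -f "$md"
```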

Let's say I only want to count headers after a CONTENT STARTS line.

awk 'BEGIN {c=0;h=0}; /^# (.*)$/ {if (c) {h+=1}}; /CONTENT STARTS/ {c=1}; END {print "headers: " h}' file.md
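Once a one-liner carries state like this, it can be easier to read spread over several lines. The following is a sketch of the same program with comments, run against a made-up file; the sample contents are an assumption for illustration only.

```shell
# Hypothetical input: one header before the marker line, two after it.
md=$(mktemp)
printf '# Preamble\nintro\nCONTENT STARTS\n# One\ntext\n# Two\n' > "$md"

count=$(awk '
  BEGIN { c = 0; h = 0 }        # c: marker seen yet, h: header count
  /^# / { if (c) h += 1 }       # count a header only after the marker
  /CONTENT STARTS/ { c = 1 }    # remember that the marker has appeared
  END { print "headers: " h }
' "$md")

echo "$count"
rm -f "$md"
```

Rules are tried in order for every line, so with the header rule placed first, a marker line that happened to be a header itself would not be counted.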

If you want to see more examples, golinuxcloud has some good ones.

Happy AWK scripting!

Tags: coding, shell
