Extra fun fact: the program’s odd name comes from the initials of its authors’ surnames, Alfred Aho, Peter Weinberger, and Brian Kernighan.

Awk’s basic syntax

When invoked on the command line, awk follows this basic pattern: awk 'pattern { action }' file. Awk executes the action whenever the pattern matches a line of the specified file. If you don’t specify a file, awk reads from standard input. Patterns can be regular expressions as well as ordinary program expressions such as comparisons. Consider this basic example: awk '/com/ { print $0 }' emails

This one-line program prints each line of the file “emails” that contains the characters com. In awk, $0 refers to the entire current line, which is also what print outputs by default. The program could have been written without $0, and it would have functioned identically.
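To make the equivalence concrete, here is a minimal sketch. The contents of “emails” are invented sample data (an assumption, not part of the original example), and the commands are best run in a scratch directory because they create that file.

```shell
# Hypothetical sample data standing in for the "emails" file.
printf '%s\n' 'alice@example.com' 'bob@example.org' 'carol@mail.com' > emails

# Explicit form: print the whole line ($0) when it contains "com".
awk '/com/ { print $0 }' emails

# A bare print outputs $0 by default, so this behaves the same.
awk '/com/ { print }' emails

# With no action at all, awk prints every matching line.
awk '/com/' emails

# Each command prints:
# alice@example.com
# carol@mail.com
```

Note that /com/ matches the characters anywhere in the line, not only at the end of an address.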

Printing fields

Because awk can identify and parse field separators, it’s useful for printing out specific columns or rows of data. We will use the “/etc/passwd” file for this example.

The one-line program awk -F: '{ print $1 }' /etc/passwd does a few things. The -F flag indicates that the next character (: in this example) should be interpreted as the field separator. Awk then prints the first field of each line, specified by $1. We can also print more than one field at a time by listing the fields in sequence, as in awk -F: '{ print $4 " " $5 }' /etc/passwd.
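Because the real /etc/passwd differs from machine to machine, the sketch below runs the same kind of command against a small passwd-style file. The file name passwd.sample and its two records are hypothetical, chosen only so the output is predictable.

```shell
# Hypothetical passwd-style records (not read from the real /etc/passwd).
cat > passwd.sample <<'EOF'
root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice Example:/home/alice:/bin/sh
EOF

# -F: makes the colon the field separator; $1 is then the login name.
awk -F: '{ print $1 }' passwd.sample
# root
# alice
```

The separator can also be written as a separate argument (-F ':'), which some readers find easier to scan.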

This prints the fourth and fifth fields of each line in the passwd file with a space between them. Note that the space is enclosed in double quotes. The quotes mark it as a string literal within the print statement, so it’s printed as written. We can also use longer literals to label and clean up our output.
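As an illustration of string literals in print, the following sketch uses a hypothetical passwd.sample file again. The label text (gid= and name=) is an arbitrary choice for this example, not something taken from the original article.

```shell
# Hypothetical passwd-style records used only for this illustration.
cat > passwd.sample <<'EOF'
root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice Example:/home/alice:/bin/sh
EOF

# A quoted space is a literal that awk concatenates between the two fields.
awk -F: '{ print $4 " " $5 }' passwd.sample
# 0 root
# 1000 Alice Example

# Longer literals work the same way and can act as labels.
awk -F: '{ print "gid=" $4 ", name=" $5 }' passwd.sample
# gid=0, name=root
# gid=1000, name=Alice Example
```

Separating the fields with a comma (print $4, $5) gives the same spacing as the first command, because awk’s default output field separator is a single space.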

Adding literal labels in front of each field makes the output easier to identify. We can also send all of this output to a new file using the shell’s redirection operator (>). Combining what we know so far lets us process data extensively. For example, we can use a regular expression to print every line of a document that contains a valid US phone number.
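The sketch below combines a regular expression with output redirection. It makes a strong simplifying assumption: a “phone number” here is just three digits, a hyphen, three digits, a hyphen, and four digits. Real validation (area-code rules, parentheses, country prefixes) would need a stricter pattern. The file names contacts.txt and phones.txt and their contents are hypothetical.

```shell
# Hypothetical input; the numbers are made up for the example.
cat > contacts.txt <<'EOF'
Alice 555-867-5309
Bob no phone on file
Carol 212-555-0100 (office)
Dave 5551234
EOF

# Match only the NNN-NNN-NNNN layout. Digits are spelled out one by one
# rather than written as [0-9]{3}, since some older awk builds lack
# interval expressions.
awk '/[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]/' contacts.txt > phones.txt

cat phones.txt
# Alice 555-867-5309
# Carol 212-555-0100 (office)
```

The redirection is handled by the shell, so phones.txt is overwritten each time the command runs; >> would append instead.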

Expanding the Awk command’s matching power

Awk can also compare values using a variety of operators. These include the standard comparison operators ==, <, >, <=, >=, and !=, as well as the regular-expression operators ~ and !~, which mean “matches” and “does not match” respectively. Comparisons can be combined with Boolean logic (&&, ||, and !) and used alongside more standard programmatic expressions.
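One distinction worth seeing in practice is exact equality (==) versus a regular-expression match (~). The following sketch uses an invented two-column file, roles.txt, whose contents are assumptions made for the example.

```shell
# Hypothetical data: a role name and a number on each line.
printf '%s\n' 'user 10' 'superuser 20' 'guest 30' > roles.txt

# == compares the whole field, so only the exact string "user" qualifies.
awk '$1 == "user"' roles.txt
# user 10

# ~ succeeds if the pattern matches anywhere in the field.
awk '$1 ~ /user/' roles.txt
# user 10
# superuser 20

# !~ selects the fields that do not match.
awk '$1 !~ /user/' roles.txt
# guest 30

# Comparisons can be joined with Boolean operators.
awk '$2 >= 20 && $2 != 30' roles.txt
# superuser 20
```

Anchoring the pattern, as in $1 ~ /^user$/, would make the match behave like the equality test.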

Awk Command Examples

The command awk 'length($0) > 80' data prints all lines longer than eighty characters in the file “data.” Note the lack of a print statement: in the absence of a specified action, awk prints the full line whenever the pattern matches.

The rule $1 == "user" prints all lines where the first field equals the string “user.” Without an -F flag, awk uses white space as the default field separator. Also note that neither awk nor the input file is specified. This form is meant for scripts saved in separate files, as covered below.

The rule $5 ~ /root/ { print $3 } prints the third field whenever the fifth field matches the regular expression /root/.

The rule { if ($5 !~ /root/) print $3 } prints field three when field 5 does not match /root/. It uses a C-like if statement, which awk also supports. This format allows more flexibility for programmers familiar with general-purpose languages.
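These descriptions can be tried out with a short sketch. Everything about the input is assumed for illustration: “data” is generated with one short line and one 81-character line, and accounts.txt is a hypothetical five-column file.

```shell
# Build a hypothetical "data" file: one short line, one line of 81 x's.
awk 'BEGIN { print "short line"; s = ""; for (i = 0; i < 81; i++) s = s "x"; print s }' > data

# Pattern only, no action: prints the single line longer than 80 characters.
awk 'length($0) > 80' data

# Hypothetical whitespace-separated records with five fields each.
cat > accounts.txt <<'EOF'
user alice 1001 staff root
admin bob 1002 wheel root
user carol 1003 staff guest
EOF

# Lines whose first field is exactly "user" (the alice and carol lines).
awk '$1 == "user"' accounts.txt

# Third field when the fifth field matches /root/.
awk '$5 ~ /root/ { print $3 }' accounts.txt
# 1001
# 1002

# The negated test written as a C-like if statement inside an action.
awk '{ if ($5 !~ /root/) print $3 }' accounts.txt
# 1003
```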

Saving scripts in files

Awk scripts can also be saved in files, which makes it practical to keep more complex programs. Running awk -f program.awk data tells awk to run the script at the specified file path, namely program.awk. The commands in that program will process the file “data.” Actions can also be run before and after the input is processed, using the BEGIN and END patterns. Within a script, the # symbol starts a comment, which lasts until the end of the line.
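A minimal sketch of such a script follows. The file names program.awk and data come from the text above, but the script body and the sample input are assumptions written for this illustration.

```shell
# Write a small awk script to program.awk (contents are illustrative).
cat > program.awk <<'EOF'
# Runs once, before any input is read.
BEGIN { print "Report start" }

# Runs for every line whose first field equals "user".
$1 == "user" { count++; print $2 }

# Runs once, after the last line has been processed.
END { print count " matching lines" }
EOF

# Hypothetical input file.
printf '%s\n' 'user alice' 'admin bob' 'user carol' > data

awk -f program.awk data
# Report start
# alice
# carol
# 2 matching lines
```

Note that the rule in the script has no surrounding single quotes. Those are only needed on the command line, to protect the program text from the shell.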

Conclusion

This guide only touches on the most basic elements of awk. There’s far more to build and explore beyond this. For more depth, see the GNU documentation for awk or The AWK Programming Language, the awk textbook written by the developers of the program.