Linux Classes

Can Awk Fight Global Warming?

The awk command combines the functions of grep and sed, making it one of the most powerful Unix commands. Using awk, you can substitute words from an input file's lines for words in a template or perform calculations on numbers within a file. (In case you're wondering how awk got such an offbeat name, it's derived from the surnames of the three programmers who invented it: Alfred Aho, Peter Weinberger, and Brian Kernighan.)

To use awk, you write a miniature program in a C-like language that transforms each line of the input file. We'll concentrate only on the print function of awk, since that's the most useful and the least confusing of all the things awk can do. The general form of the awk command is

awk <pattern>'{print <stuff>}' <file>

In this case, stuff is going to be some combination of text, special variables that represent each word in the input line, and perhaps a mathematical operator or two. As awk processes each line of the input file, each word on the line is assigned to variables named $1 (the first word), $2 (the second word), and so on. (The variable $0 contains the entire line.)

Let's start with a file, words.data, that contains these lines:

nail hammer wood
pedal foot car
clown pie circus

Now we'll use the print function in awk to plug the words from each input line into a template, like this:

awk '{print "Hit the",$1,"with your",$2}' words.data
Hit the nail with your hammer
Hit the pedal with your foot
Hit the clown with your pie
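
The $0 variable mentioned earlier is easy to try on the same file. Here is a small sketch, assuming it is fine to recreate words.data in the current directory so the example stands on its own. It prints the third word of each line followed by the whole line:

```shell
# Recreate the sample file from this lesson (assumption: the current
# directory is writable and words.data may be overwritten).
printf '%s\n' 'nail hammer wood' 'pedal foot car' 'clown pie circus' > words.data

# $3 is the third word; $0 is the entire input line.
# Putting ":" right next to $3 joins them with no space in between,
# while the comma inserts a single space before $0.
awk '{print $3 ":", $0}' words.data
# wood: nail hammer wood
# car: pedal foot car
# circus: clown pie circus
```

The comma versus no-comma distinction is worth remembering: items separated by commas are printed with a space between them, while items written side by side are glued together.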

Say some of the data in your input file is numeric, as in the grades.data file shown here:

Rogers 87 100 95
Lambchop 66 89 76
Barney 12 36 27

You can perform calculations like this:

awk '{print "Avg for",$1,"is",($2+$3+$4)/3}' grades.data
Avg for Rogers is 94
Avg for Lambchop is 77
Avg for Barney is 25
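
If the same calculation is needed more than once, it can be stored in a variable inside the braces. The following is only a sketch, assuming grades.data is recreated in the current directory; the variable name total is made up for this illustration:

```shell
# Recreate the sample grades file from this lesson (assumption: the
# current directory is writable and grades.data may be overwritten).
printf '%s\n' 'Rogers 87 100 95' 'Lambchop 66 89 76' 'Barney 12 36 27' > grades.data

# "total" is a hypothetical variable name; awk creates variables on first use.
# Statements inside the braces are separated by semicolons.
awk '{total=$2+$3+$4; print $1, "total", total, "avg", total/3}' grades.data
# Rogers total 282 avg 94
# Lambchop total 231 avg 77
# Barney total 75 avg 25
```

The averages here happen to be whole numbers. When a result is not a whole number, awk prints it with decimal places.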

So far, we haven't specified any value for pattern in these examples, but if you want to exclude lines from being processed, you can enter something like this:

awk /^clown/'{print "See the",$1,"at the",$3}' words.data
See the clown at the circus

Here, we told awk to consider only the input lines that start with clown. Note also that there is no space between the pattern and the print specifier. If you put a space there, the shell passes them to awk as two separate arguments: awk takes /^clown/ as the entire program and treats the {print ...} text as the name of an input file, so the print template is never run. But all this is just the tip of the awk iceberg; entire books have been written on this command. If you are a programmer, try the man awk command.
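
A common way to sidestep the spacing issue is to put the pattern and the print specifier inside one pair of single quotes, so the shell hands awk a single argument and spaces inside it are harmless. The sketch below assumes both sample files are recreated in the current directory. It also shows that a pattern does not have to be a text match; a numeric comparison works too:

```shell
# Recreate both sample files from this lesson (assumption: the current
# directory is writable and these files may be overwritten).
printf '%s\n' 'nail hammer wood' 'pedal foot car' 'clown pie circus' > words.data
printf '%s\n' 'Rogers 87 100 95' 'Lambchop 66 89 76' 'Barney 12 36 27' > grades.data

# Same filter as above, with pattern and action in one quoted string.
awk '/^clown/ {print "See the",$1,"at the",$3}' words.data
# See the clown at the circus

# A comparison as the pattern: only lines whose second word is above 50.
awk '$2 > 50 {print $1, "scored", $2}' grades.data
# Rogers scored 87
# Lambchop scored 66
```

Quoting the whole program also keeps the shell from interpreting any special characters that a more complicated pattern might contain.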

For more information on the awk command, see the awk manual.


Comments (most recent first)

Srikrishnan     (02 Sep 2010, 09:23)
To count the number of CTRL+F characters in a file, we are using the command below, but the files are huge (1 million+ records) and it takes a long time to return the count (approx. 40 minutes).

awk '{cnt+=gsub(/\006/,"&")}END {print cnt}' Sri.dat

Please help with tuning the performance.

Thanks in advance
bhagi     (12 Aug 2010, 01:18)
Hi rayen,

To print the sum of all the first elements, use:
awk '{sum+=$1} END {print "sum is",sum}' num.data
I think it is simpler.
balaji     (08 Aug 2010, 09:33)
thanx, it is very easy to understand
Anil kumar     (08 Aug 2010, 00:57)
nice explanation !!.......great work..
bhanu     (04 Aug 2010, 16:12)
How can I print the first element of the next row in a file?
Satish Mongam     (30 Jul 2010, 07:13)
It's very useful and easy to understand.
rayen     (14 Jul 2010, 12:17)
@Bob
That's perfect.
thank you so much again
Bob Rankin     (14 Jul 2010, 12:11)
In this case, the cut command splits each line on the specified delimiter (a space) and returns only the first field (-f1).

See the cut command help: http://lowfatlinux.com/linux-columns-cut.html
rayen     (14 Jul 2010, 12:02)
It works, thanks a lot Bob.
Could you explain what "cut -d' ' -f1" does?
Thanks a lot again.
Bob Rankin     (14 Jul 2010, 10:25)
@rayen - Try this:

cut -d' ' -f1 | awk '{sum+=$1} END {print sum}'
rayen     (14 Jul 2010, 09:58)
Hi,
How can I get the sum of all the $1 values,
that is, the sum of the first number on each line?

regards
Esra     (05 Jul 2010, 05:59)
Thanks, it's been very useful.
Nick     (20 Jun 2010, 22:19)
Gustavo,

I think the below might do what you want

awk '!/^mytest/ {print $1}' test.txt

gaurav     (17 Jun 2010, 04:54)
Thanks great explanation!
Gustavo     (28 May 2010, 08:15)
Hi,
How can I remove or edit lines in a file that have specific content?

For example, in a file called test.txt, I want to remove the text "mytest":

-----------------
myfirsttest
is not my tsxt
my test is
mytest
thanks
-----------------
Sishui     (10 May 2010, 20:30)
Thanks, great explanation!!
surendar     (06 May 2010, 07:50)
It's good. Simple and easily understandable.
sreedhar     (05 May 2010, 05:23)
Thanks...for simple but effective illustrations
Magesh     (05 May 2010, 00:01)
This article has helped me understand the basics of the awk command. Really a useful article. Thank you, sir.
Soji Antony     (25 Apr 2010, 15:02)
thanks......for ur valuable information
vivek koul     (09 Apr 2010, 07:08)
What is a filter in Linux?
Gautam     (27 Mar 2010, 12:53)
Gr8 snapshot of awk command.
Vinay     (25 Mar 2010, 22:29)
Good Information with examples for better understanding.
mastan     (23 Mar 2010, 01:12)
Gr8 start up of awk
Anu     (18 Mar 2010, 09:14)
At last I know the awk command. Thanks.
Ashwini     (17 Mar 2010, 03:06)
thanks for giving me good information about awk.
Artur     (06 Mar 2010, 07:08)
Thanks!
Stephan Reiner     (03 Mar 2010, 05:27)
Very good intro, thanks for putting it all together!
Rajesh     (02 Feb 2010, 02:54)
Great solution for my search.



Copyright © by Bob Rankin
All rights reserved - Redistribution is allowed only with permission.