The basic function of awk is to
search files for lines or other text units containing one or more patterns.
When a line matches one of the patterns, special actions are performed on that
line.
Programs in awk are different from programs in most other languages, because awk programs are "data-driven":
you describe the data you want to work with and then what to do when you find
it. Most other languages are "procedural." You have to describe, in
great detail, every step the program is to take. When working with procedural
languages, it is usually much harder to clearly describe the data your program
will process. For this reason, awk programs are often refreshingly easy to read
and write.
There are several ways to run awk. If the program is short, it is easiest to run it on the command line:
awk PROGRAM inputfile(s)
If multiple changes have to be made, possibly regularly and on multiple files, it is easier to put the awk commands in a script file. The script file is then passed to awk with the -f option:
awk -f PROGRAM-FILE inputfile(s)
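As a minimal sketch of the two invocation styles, the following runs the same one-line program first inline and then from a file. The input data and the file names input.txt and first.awk are made up for illustration; they do not come from the examples in this post.

```shell
# Hypothetical sample input in a scratch directory (names are illustrative only).
tmpdir=$(mktemp -d)
printf 'alpha 1\nbeta 2\n' > "$tmpdir/input.txt"

# 1. Short program given directly on the command line.
inline_out=$(awk '{ print $1 }' "$tmpdir/input.txt")

# 2. The same program stored in a file and read with -f.
printf '{ print $1 }\n' > "$tmpdir/first.awk"
file_out=$(awk -f "$tmpdir/first.awk" "$tmpdir/input.txt")

echo "$inline_out"
echo "$file_out"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

Both invocations should print the same two words. Note that the inline program is wrapped in single quotes so that the shell does not expand $1 itself.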
Printing selected fields
When awk reads a line of a file, it divides the line into fields based on the specified input field separator, FS. The variables $1, $2, $3, ..., $N hold the values of the first, second, third, and so on up to the last field of an input line. The variable $0 (zero) holds the value of the entire line.
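The field variables can be tried out on a single line of text. This is only a sketch: the sample line is invented, and NF, which is not mentioned above, is awk's built-in variable holding the number of fields in the current line, so $NF refers to the last field.

```shell
# An invented input line with four whitespace-separated fields.
line='one two three four'

first=$(echo "$line" | awk '{ print $1 }')    # first field
last=$(echo "$line" | awk '{ print $NF }')    # last field, via the field count NF
count=$(echo "$line" | awk '{ print NF }')    # number of fields
whole=$(echo "$line" | awk '{ print $0 }')    # the entire line

echo "$first | $last | $count | $whole"
```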
In the output of ls -l,
there are 9 columns. The print statement uses these
fields as follows:
ls -l | awk '{ print $5, $9 }'
4096 jenkins_upgrade
57 shellPractice
120 venky.sh
4096 wcpjars
Without formatting, using
only the output separator, the output looks rather poor. Inserting a couple of
tabs and a string to indicate what output this is will make it look a lot
better:
ls -ldh * | grep -v total | awk '{ print "Size is " $5 " bytes for " $9 }'
Size is 4.0K bytes for jenkins_upgrade
Size is 57 bytes for shellPractice
Size is 120 bytes for venky.sh
Size is 4.0K bytes for wcpjars
df -h | sort -rnk 5 | head -3 | awk '{ print "Partition " $6 "\t: " $5 " full!" }'
Partition /boot : 39% full!
Partition / : 10% full!
Partition /home : 3% full!
Sequence   Meaning
\n         Newline character
\t         Tab
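To see both escape sequences at work, the sketch below prints a tiny two-line, two-column listing from a BEGIN block, so no input file is needed. The column contents are invented for illustration.

```shell
# \t separates the columns, \n starts the second row.
out=$(awk 'BEGIN { print "name\tsize\nfile\t57" }')
echo "$out"
```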
The print command and regular expressions
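A regular expression enclosed in slashes can be used as the pattern; the action then runs only on the lines that match it:
awk 'EXPRESSION { PROGRAM }' file(s)
The following example prints the mount point and usage of every line of df output that contains the string dev:
df -h | awk '/dev/ { print $6 "\t: " $5 }'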
/ : 10%
/dev : 0%
/dev/shm : 0%
/home : 3%
/boot : 39%
ls -l | awk '/\<(s|u|O).*\.jar$/ { print $9 }'
OnlineDataModel.jar
sharedserviceslib.jar
utilityframework.jar
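The same idea can be tried without depending on the contents of a real directory. The sketch below feeds an invented list of names to awk and keeps only those ending in .jar. It leaves out the \< word-boundary operator used above, because that operator is a GNU awk extension and may not be available in other awk implementations.

```shell
# Invented file names, one per line.
names='OnlineDataModel.jar
notes.txt
sharedserviceslib.jar
README'

# Print only the lines that end in ".jar"; the dot is escaped so it matches literally.
jars=$(printf '%s\n' "$names" | awk '/\.jar$/ { print $1 }')
echo "$jars"
```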
To print text before any input is processed, use the BEGIN statement:
ls -l | awk 'BEGIN { print "Files found:\n" } /\<[ax].*\.conf$/ { print $9 }'
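A small sketch of BEGIN on invented input shows that the heading is printed once, before the first matching line. The second command assumes nothing beyond standard awk behaviour: a program consisting only of a BEGIN block runs without reading any input at all.

```shell
# The heading comes first, then every name ending in ".conf".
listing=$(printf 'a.conf\nb.txt\nx.conf\n' | awk '
BEGIN { print "Files found:" }
/\.conf$/ { print $0 }')
echo "$listing"

# With only a BEGIN block, awk prints the heading and exits without reading input.
heading_only=$(awk 'BEGIN { print "Files found:" }' < /dev/null)
echo "$heading_only"
```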
The END statement can be added to insert text after the entire input is processed:
ls -l | awk '/\<[ax].*\.conf$/ { print $9 }
END { print "Can I do anything else for you, mistress?" }'
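The field separator is represented by the built-in variable FS. By default, fields are separated by runs of spaces and tabs; setting FS in a BEGIN block changes the separator before the first line is read:
awk 'BEGIN { FS=":" } { print $1 "\t" $5 }' /etc/passwd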
root root
bin bin
daemon daemon
adm adm
lp lp
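The separator can also be set with the -F command-line option instead of assigning FS in a BEGIN block. The sketch below compares the two on one invented passwd-style record, so it does not depend on the contents of the real /etc/passwd.

```shell
# An invented record in /etc/passwd format (seven colon-separated fields).
record='root:x:0:0:root:/root:/bin/bash'

# Setting FS inside the program...
with_fs=$(echo "$record" | awk 'BEGIN { FS=":" } { print $1 "\t" $5 }')

# ...and the equivalent -F option on the command line.
with_flag=$(echo "$record" | awk -F: '{ print $1 "\t" $5 }')

echo "$with_fs"
echo "$with_flag"
```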
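The output from an entire print statement is called an output record. Each print command results in one output record, and then outputs a string called the output record separator, ORS, which is a newline by default. Items separated by commas in a print statement are joined by the output field separator, OFS, which is a single space by default.
awk 'BEGIN { OFS=";" ; ORS="\n-->\n" } { print $1,$2}' test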
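Since the contents of the file test are not shown, here is a sketch with invented two-field input that makes the effect of both separators visible. It also shows that OFS is only inserted where the print items are separated by a comma; items written next to each other are simply concatenated.

```shell
# Invented input: two records with two fields each.
out=$(printf 'record1 data1\nrecord2 data2\n' |
  awk 'BEGIN { OFS=";" ; ORS="\n-->\n" } { print $1,$2 }')
echo "$out"

# Comma between items: OFS is inserted. No comma: plain concatenation.
with_comma=$(echo 'a b' | awk 'BEGIN { OFS=";" } { print $1, $2 }')
no_comma=$(echo 'a b' | awk 'BEGIN { OFS=";" } { print $1 $2 }')
echo "$with_comma $no_comma"
```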
cat revenues
20021009 20021013 consultancy BigComp 2500
20021015 20021020 training EduComp 2000
20021112 20021123 appdev SmartComp 10000
20021204 20021215 training EduComp 5000
cat total.awk
{ total=total + $5 }
{ print "Send bill for " $5 " dollar to " $4 }
END { print "---------------------------------\nTotal revenue: " total }
awk -f total.awk revenues
Send bill for 2500 dollar to BigComp
Send bill for 2000 dollar to EduComp
Send bill for 10000 dollar to SmartComp
Send bill for 5000 dollar to EduComp
---------------------------------
Total revenue: 19500
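The running total works because an awk variable that has never been assigned counts as zero in arithmetic, so total needs no explicit initialisation. As a compact sketch of the same accumulation, the command below feeds the four revenue records shown above to an inline program that uses the += shorthand and prints only the sum from the END block.

```shell
# The four records from the revenues file, passed on standard input.
total=$(printf '%s\n' \
  '20021009 20021013 consultancy BigComp 2500' \
  '20021015 20021020 training EduComp 2000' \
  '20021112 20021123 appdev SmartComp 10000' \
  '20021204 20021215 training EduComp 5000' |
  awk '{ total += $5 } END { print total }')   # add field 5 of every line, print once at the end
echo "$total"
```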