USING SED & AWK UTILITIES

Main Objectives of this Practice Tutorial

Use the sed command to manipulate text contained in a file.

List and explain several instructions associated with the sed command.

Use the sed command as a filter with Linux pipeline commands.

Use the awk command to manipulate text contained in a file.

List and explain several comparison operators and variables associated with the awk command.

Use the awk command as a filter with Linux pipeline commands.

Tutorial Reference Material

Course Notes

Linux Command/Shortcut Reference

YouTube Videos

Course Notes:

PDF | PPTX

Text Manipulation

Man Pages

sed
awk

Brauer Instructional Videos:

KEY CONCEPTS

Using the sed Utility

Usage:

Syntax: sed [-n] 'address instruction' filename

How it Works:

The sed command reads the input file one line at a time and applies the expression (i.e. the area contained within quotes) to each line.
The expression can be within single quotes or double quotes.
The expression contains an address (match condition) and an instruction (operation).
If the line matches the address, then it will perform the instruction.

Address:

Can use a line number, to select a specific line (for example: 5)
Can specify a range of line numbers (for example: 5,7)
Can specify a regular expression to select all lines that match (e.g. /^[0-9].*[0-9]$/)
When using a regular expression, you must enclose it within forward slashes /
If NO address is present, the instruction will apply to ALL lines
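The address forms listed above can be tried on a small throwaway file. This is only a sketch: the sample file and its contents are made up for illustration and are not part of the tutorial data.

```shell
# Build a 6-line sample file (hypothetical contents, created in a temp location).
sample=$(mktemp)
printf '%s\n' 'alpha' '2 apples' 'bravo' '4 pears 4' 'charlie' '6 plums 6' > "$sample"

# Line-number address: print only line 5.
one_line=$(sed -n '5 p' "$sample")

# Range address: print lines 2 through 3.
range=$(sed -n '2,3 p' "$sample")

# Regular-expression address: lines that begin AND end with a digit.
regex=$(sed -n '/^[0-9].*[0-9]$/ p' "$sample")

# No address: the instruction applies to every line, so all 6 lines print.
all_count=$(( $(sed -n 'p' "$sample" | wc -l) ))

printf '%s\n' "$one_line" "$range" "$regex" "$all_count"
rm -f "$sample"
```

Note that the line "2 apples" is not selected by the regular expression because it does not end with a digit.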

Instruction:

Action to take for matched line(s)
Some common instructions and their purpose:
p - print the line
d - delete the line
q - quit after the line
s/pattern/replacement/ - substitute text (add g at the end to replace all occurrences on a line)

Examples:

sed -n '3 p' text.txt (print 3rd line)
sed -n '1,5 p' text.txt (print lines 1 to 5)
sed -n -e '4 p' -e '7 p' text.txt (print only lines 4 and 7)
sed -n '/^Therefore/ p' text.txt (print lines that begin with the pattern "Therefore")

sed '/^#/ d' myscript.bash (delete lines that begin with the comment character "#")
sed '/exit/ q' text.txt (print all lines up to and including the first line containing the pattern "exit", then quit)
sed 's/this/that/' text.txt (replace the first occurrence of pattern "this" on each line with "that")
sed 's/this/that/g' text.txt (replace ALL occurrences of pattern "this" with "that")
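To see how the s, d and q instructions behave, here is a minimal sketch run against a made-up four-line file (the file and its contents are assumptions for illustration only).

```shell
# Hypothetical sample file created in a temp location.
sample=$(mktemp)
printf '%s\n' '# header' 'this and this' 'exit now' 'never shown by q' > "$sample"

# Without g, only the first "this" on each line is replaced.
first_only=$(sed 's/this/that/' "$sample" | sed -n '2 p')

# With g, every "this" on each line is replaced.
every=$(sed 's/this/that/g' "$sample" | sed -n '2 p')

# d removes the lines that begin with "#".
no_comments=$(sed '/^#/ d' "$sample")

# q prints up to and including the first line containing "exit", then stops.
until_exit=$(sed '/exit/ q' "$sample")

printf '%s\n' "$first_only" "$every" "$no_comments" "$until_exit"
rm -f "$sample"
```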

Using the awk Utility

Usage:

awk options 'selection-criteria {action}' file-name

Notes:

The awk command reads the input file one line at a time and applies the expression (contained within quotes) to each line.
The expression (contained in quotes) consists of selection criteria and an action (contained within braces) to execute if the selection criteria are matched
If no pattern is specified, awk selects all lines in the input
If no action is specified, awk copies the selected lines to standard output
You can use parameters like $1, $2 to represent first field, second field, etc.
You can use the -F option with the awk command to specify the field delimiter.
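The notes above can be sketched with a tiny semi-colon delimited file. The file name, fields and values here are invented for illustration; they are not the course's cars database.

```shell
# Hypothetical two-record file with ";" as the field delimiter.
cars=$(mktemp)
printf '%s\n' 'ford;mustang;1965;45' 'chevy;impala;1985;85' > "$cars"

# -F";" sets the delimiter; $1 and $2 are the first and second fields.
make_model=$(awk -F";" '{print $1, $2}' "$cars")

# Pattern but no action: matching lines are copied to standard output unchanged.
no_action=$(awk -F";" '/chevy/' "$cars")

# Action but no pattern: the action runs for every line.
every_line=$(awk -F";" '{print $2}' "$cars")

printf '%s\n' "$make_model" "$no_action" "$every_line"
rm -f "$cars"
```

The comma in the print action separates the output fields with a single space, which is why the delimiter in the output differs from the one in the file.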

Patterns: Regular Expressions

You can use a regular expression, enclosed within slashes, as a pattern.
The ~ operator tests whether a field or variable matches a regular expression
The !~ operator tests for no match.
You can perform both numeric and string comparisons using relational operators
You can combine any of the patterns using the Boolean operators || (OR) and && (AND)
You can use built-in variables (like NR or "record number" representing line number) with comparison operators
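As a rough illustration of the pattern types above, the following sketch uses a made-up space-delimited file (make, model, quantity). All names and values are assumptions.

```shell
# Hypothetical sample data: make, model, quantity.
data=$(mktemp)
printf '%s\n' 'ford mustang 45' 'chevy impala 85' 'ford ltd 10' 'plym fury 73' > "$data"

# ~ tests whether field 1 matches the regular expression.
match=$(awk '$1 ~ /^f/ {print $2}' "$data")

# !~ tests for no match.
no_match=$(awk '$1 !~ /^f/ {print $2}' "$data")

# && requires both conditions: make matches "ford" AND quantity is above 20.
both=$(awk '$1 ~ /ford/ && $3 > 20 {print $2}' "$data")

# || accepts either condition; NR is the current record (line) number.
either=$(awk 'NR == 1 || NR == 4 {print $1}' "$data")

printf '%s\n' "$match" "$no_match" "$both" "$either"
rm -f "$data"
```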

Patterns: Relational Operators

The relational operators <, <=, >, >=, == and != can be used with the awk utility for pattern searching.
Since those symbols are used within the quoted expression, they are NOT confused with redirection symbols.

Examples:

awk 'NR == 3 {print}' text.txt (print 3rd line)
awk 'NR >= 1 && NR <= 5 {print}' text.txt (print lines 1 to 5)
awk '/NOTE:/ {print}' text.txt (print lines that contain the pattern: "NOTE:")

awk -F";" '$1 ~ /ford/ {print}' cars.dat (print records (of semi-colon delimited file) whose 1st field matches: "ford")
awk -F";" '$1 ~ /ford/ {print $2,$4}' cars.dat (same as above, but only print 2nd and 4th fields)

INVESTIGATION 1: USING THE SED UTILITY

In this section, you will learn how to manipulate text using the sed utility.

Perform the Following Steps:

Login to your matrix account.
Issue a Linux command to confirm you are located in your home directory.
Issue a Linux command to create a directory called sed
Issue a Linux command to change to the sed directory.
Issue a Linux command to confirm you are located in the sed directory.
Issue the following linux command (copy and paste to save time):
wget https://ict.senecacollege.ca/~murray.saul/uli101/data.txt
Issue the more command to quickly view the contents of the data.txt file.
When finished, exit the more command by pressing the letter q
The p command in sed is used to print or display the contents of a text file.
Issue the following linux command:
sed 'p' data.txt

You should notice that each line appears twice. Without the -n option, sed automatically displays every line it reads, and the p instruction then prints each line a second time.
Issue the following linux pipeline command:
sed -n 'p' data.txt | tee sed-1.txt

What do you notice?
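The effect of the -n option on the p instruction can be reproduced on any small file. The following sketch uses a made-up three-line file rather than data.txt.

```shell
# Hypothetical three-line file.
f=$(mktemp)
printf '%s\n' 'one' 'two' 'three' > "$f"

# Without -n: sed's automatic output plus the p instruction shows each line twice.
with_auto=$(sed 'p' "$f")

# With -n: automatic output is suppressed, so only p produces output.
without_auto=$(sed -n 'p' "$f")

printf '%s\n' "$with_auto" '---' "$without_auto"
rm -f "$f"
```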

You can specify an address (line #, line #s or range of line #s) when using the sed utility.
Issue the following linux pipeline command:
sed -n '1 p' data.txt | tee sed-2.txt

You should see the first line of the text file displayed.
Issue the following linux pipeline command:
sed -n '2,5 p' data.txt | tee sed-3.txt

What is displayed? How would you change the command to display only lines 2 and 5?

The s command is used to substitute patterns (similar to the method demonstrated in the vi editor).
Issue the following linux pipeline command:
sed '2,5 s/TUTORIAL/LESSON/g' data.txt | tee sed-4.txt

What do you notice? View the original contents of lines 2 to 5 in the data.txt file in another shell to confirm that the substitution occurred.
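A substitution restricted to a line range can be sketched on a small stand-in file (its contents are invented here). The sketch also checks that sed, used this way, writes the changed text to standard output and leaves the input file itself untouched.

```shell
# Hypothetical three-line file standing in for data.txt.
f=$(mktemp)
printf '%s\n' 'TUTORIAL 1' 'TUTORIAL 2 TUTORIAL' 'TUTORIAL 3' > "$f"

before=$(cat "$f")

# Substitute only on lines 2 to 3; line 1 is outside the address range.
edited=$(sed '2,3 s/TUTORIAL/LESSON/g' "$f")

# The file on disk still has its original contents.
after=$(cat "$f")

printf '%s\n' "$edited"
rm -f "$f"
```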

The q command terminates or quits the execution of the sed utility as soon as it reads a particular line or matching pattern.
Issue the following linux pipeline command:
sed '11 q' data.txt | tee sed-5.txt

What did you notice?

You can use regular expressions to select lines that match a pattern. The rules remain the same for using regular expressions as demonstrated in lab8 except the regular expression must be contained within delimiters such as the forward slash "/" when using the sed utility.
Issue the following linux pipeline command:
sed -n '/^The/ p' data.txt | tee sed-6.txt

What do you notice?
Issue the following linux pipeline command:
sed -n '/d$/ p' data.txt | tee sed-7.txt

What do you notice?

The sed utility can also be used as a filter to manipulate text that was generated from linux commands.
Issue the following linux pipeline command:
ls | sed -n '/txt$/ p' | tee sed-8.txt

What did you notice?
Issue the following linux pipeline command:
who | sed -n '/^[a-m]/ p' | tee sed-9.txt | more

What did you notice?
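Because the output of ls and who differs from system to system, the following sketch feeds sed some made-up lines with printf instead. The file names and user names are assumptions used only to show sed acting as a filter in a pipeline.

```shell
# Stand-in for "ls": keep only names that end in "txt".
listing=$(printf '%s\n' 'notes.txt' 'script.bash' 'data.txt' | sed -n '/txt$/ p')

# Stand-in for "who": keep only lines whose first character is in the range a to m.
users=$(printf '%s\n' 'alice pts/0' 'zed pts/1' 'mona pts/2' | sed -n '/^[a-m]/ p')

printf '%s\n' "$listing" "$users"
```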
Issue the following to run a checking script:
bash /home/murray.saul/scripts/week11-check-1
If you encounter errors, make corrections and re-run the checking script until you
receive a congratulations message, then you can proceed.

In the next investigation, you will learn how to manipulate text using the awk utility.

INVESTIGATION 2: USING THE AWK UTILITY

In this section, you will learn how to use the awk utility to manipulate text and generate reports.

Perform the Following Steps:

Change to your home directory and issue a command to confirm you are located
in your home directory.
Issue a Linux command to create a directory called awk
Issue a Linux command to change to the awk directory.
Issue a Linux command to confirm you are located in the awk directory.
Issue the following linux command (copy and paste to save time):
wget https://ict.senecacollege.ca/~murray.saul/uli101/cars.txt
Issue the more command to quickly view the contents of the cars.txt file.
When finished, exit the more command by pressing the letter q

The "print" action (command) is the default action of awk to print all selected lines that match a pattern.
This action (contained in braces) can provide more options such as printing specific fields of selected lines (or records) from a database.
Issue the following Linux command to display records in the "cars.txt" database that contain the make "ford":
awk '/ford/ {print}' cars.txt
Issue the following Linux pipeline command to display records in the "cars.txt" database that contain the make "ford":
awk '/ford/' cars.txt | tee awk-1.txt

What do you notice?

You can use variables with the "print" action for further processing. We will discuss the following variables in this tutorial:

$0 - Current record (entire line)
$1 - First field in record
$n - nth field in record
NR - Record Number (order in database)
NF - Number of fields in current record

For a listing of more variables, please consult your course notes.
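A minimal sketch of these variables, run on a made-up two-line file (not cars.txt), where the second record deliberately has fewer fields:

```shell
# Hypothetical file: the first record has 3 fields, the second has 2.
f=$(mktemp)
printf '%s\n' 'ford mustang 65' 'chevy impala' > "$f"

# NR is the record number, NF the number of fields, $1 the first field.
summary=$(awk '{print NR, NF, $1}' "$f")

# $0 is the entire current record.
whole=$(awk 'NR == 2 {print $0}' "$f")

printf '%s\n' "$summary" "$whole"
rm -f "$f"
```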

The tilde character ~ is used to test whether a particular field matches a pattern, and field variables such as $2 can be used with the "print" action to display only particular fields.
Issue the following linux pipeline command to display the model, year, quantity and price in the "cars.txt" database for makes of "chevy":
awk '/chevy/ {print $2,$3,$4,$5}' cars.txt | tee awk-2.txt

Notice that a space " " is the delimiter for the fields that appear as standard output.
Issue the following Linux pipeline command to display all plymouths (plyms) by model name, price and quantity:
awk '/plym/ {print $2,$5,$4}' cars.txt | tee awk-3.txt

You can also use comparison operators to specify conditions for processing with matched patterns when using the awk command. Since they are used WITHIN the quoted awk expression, they are not confused with redirection symbols.

Comparison Operators:

<     Less than
<=   Less than or equal
>     Greater than
>=   Greater than or equal
==   Equal
!=    Not equal
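A short sketch of these operators follows. The records are invented and simply follow the same field order as the examples in this tutorial (make, model, year, quantity, price).

```shell
# Hypothetical records: make model year quantity price.
f=$(mktemp)
printf '%s\n' 'ford ltd 83 15 10500' 'chevy nova 79 60 3000' 'plym fury 77 73 2500' > "$f"

# Numeric comparison on field 5. The single quotes keep the shell from
# treating < as input redirection.
cheap=$(awk '$5 < 5000 {print $1, $5}' "$f")

# String comparison: the string constant goes inside double quotes.
not_ford=$(awk '$1 != "ford" {print $1}' "$f")

# Equality test on a numeric field.
exact=$(awk '$3 == 79 {print $2}' "$f")

printf '%s\n' "$cheap" "$not_ford" "$exact"
rm -f "$f"
```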
Issue the following Linux pipeline command to display the car make, model, quantity and price of all vehicles that are priced less than $5,000:
awk '$5 < 5000 {print $1,$2,$4,$5}' cars.txt | tee awk-4.txt

What do you notice?
Issue the following Linux pipeline command to display the car make, model, quantity and price of all vehicles that are priced less than $5,000:
awk '$5 < 5000 {print $1,$2,$4,$5}' cars.txt | tee awk-5.txt

The symbol tilde ~ is used to match a pattern for a particular field number.
Issue the following Linux pipeline command to display the car make, model and quantity of all car makes that begin with the letter 'f':
awk '$1 ~ /^f/ {print $1,$2,$4}' cars.txt | tee awk-6.txt

Compound operators can be used to join selection criteria together.

Compound Operators:

&& (and)
|| (or)
Issue the following linux pipeline command to list all "fords" that are greater than $10,000 in price:
awk '$1 ~ /ford/ && $5 > 10000 {print $0}' cars.txt | tee awk-7.txt
Issue the following to run a checking script:
bash /home/murray.saul/scripts/week11-check-2
If you encounter errors, make corrections and re-run the checking script until you
receive a congratulations message, then you can proceed.

After completing the Review Questions section for additional practice, work on your online assignment 3,
sections 4 to 6, labelled: More Scripting (add), Yet More Scripting (oldfiles), and sed And awk

LINUX PRACTICE QUESTIONS

The purpose of this section is to obtain extra practice to help with quizzes, your midterm, and your final exam.

Here is a link to the MS Word Document of ALL of the questions displayed below but with extra room to answer on the document to simulate a quiz:

https://ict.senecacollege.ca/~murray.saul/uli101/uli101_week11_practice.docx

Your instructor may take up these questions during class. It is up to the student to attend classes in order to obtain the answers to the following questions. Your instructor will NOT provide these answers in any other form (e.g. e-mail).

Review Questions:

Part A: Display Results from Using the sed Utility

Note the contents from the following tab-delimited file called ~murray.saul/uli101/stuff.txt: (this file pathname exists for checking your work)

Line one.
This is the second line.
This is the third.
This is line four.
Five.
Line six follows
Followed by 7
Now line 8
and line nine
Finally, line 10

Write the results of each of the following Linux commands for the above-mentioned file:

sed -n '3,6 p' ~murray.saul/uli101/stuff.txt
sed '4 q' ~murray.saul/uli101/stuff.txt
sed '/the/ d' ~murray.saul/uli101/stuff.txt
sed 's/line/NUMBER/g' ~murray.saul/uli101/stuff.txt

Part B: Writing Linux Commands Using the sed Utility

Write a single Linux command to perform the specified tasks for each of the following questions.

Write a Linux sed command to display only lines 5 to 9 for the file: ~murray.saul/uli101/stuff.txt
Write a Linux sed command to display only lines that begin with the pattern “and” for the file: ~murray.saul/uli101/stuff.txt
Write a Linux sed command to display only lines that end with a digit for the file: ~murray.saul/uli101/stuff.txt
Write a Linux sed command to display lines that match the pattern “line” (upper or lowercase) for the file: ~murray.saul/uli101/stuff.txt and save the results (overwriting previous contents) to: ~/results.txt

Part C: Display Results from Using the awk Utility

Note the contents from the following tab-delimited file called ~murray.saul/uli101/stuff.txt: (this file pathname exists for checking your work)

Line one.
This is the second line.
This is the third.
This is line four.
Five.
Line six follows
Followed by 7
Now line 8
and line nine
Finally, line 10

Write the results of each of the following Linux commands for the above-mentioned file:

awk 'NR == 3 {print}' ~murray.saul/uli101/stuff.txt
awk 'NR >= 2 && NR <= 5 {print}' ~murray.saul/uli101/stuff.txt
awk '$1 ~ /This/ {print $2}' ~murray.saul/uli101/stuff.txt
awk '$1 ~ /This/ {print $3,$2}' ~murray.saul/uli101/stuff.txt

Part D: Writing Linux Commands Using the awk Utility

Write a single Linux command to perform the specified tasks for each of the following questions.

Write a Linux awk command to display all records for the file: ~/cars whose fifth field is greater than 10000.
Write a Linux awk command to display the first and fourth fields for the file: ~/cars whose fifth field begins with a number.
Write a Linux awk command to display the second and third fields for the file: ~/cars for records that match the pattern “chevy”.
Write a Linux awk command to display the first and second fields for all the records contained in the file: ~/cars

Tutorial11: Sed & Awk Utilities
