Open main menu

CDOT Wiki β

Changes

Tutorial10: Sed & Awk Utilities

345 bytes added, 20:36, 4 September 2023
no edit summary
{{Admon/caution|DO NOT USE THIS VERSION OF THE LAB. This page will no longer be updated.|'''New version here:''' https://seneca-ictoer.github.io/ULI101/A-Tutorials/tutorial10<br />'''Andrew's students please go here:''' http://wiki.littlesvr.ca/wiki/OPS145_Lab_9}}
=USING SED &amp; AWK UTILTIES=
<br>
|- valign="top" style="padding-left:15px;"
|colspan="2" |'''Slides''':<ul><li>Week 11 10 Lecture 1 Notes:<br> [https://githubwiki.comcdot.senecacollege.ca/ULI101uli101/slides/raw/main/ULI101-Week1110.1.pdf PDF] | [https://githubwiki.cdot.comsenecacollege.ca/ULI101uli101/slides/raw/main/ULI101-Week1110.1.pptx PPTX]</li><li>Week 11 10 Lecture 2 Notes:<br> [https://githubwiki.cdot.comsenecacollege.ca/ULI101uli101/slides/raw/main/ULI101-Week1110.2.pdf PDF] | [https://githubwiki.cdot.comsenecacollege.ca/ULI101uli101/slides/raw/main/ULI101-Week1110.2.pptx PPTX] <br></li></ul>
=INVESTIGATION 1: USING THE SED UTILITY=
<span style="color:red;">'''ATTENTION''': Effective '''May 9, 2022''' - this This online tutorial will be required to be completed by '''Friday in week 11 by midnight'''<br>to obtain a grade of '''2%''' towards this course</span><br><br>
In this investigation, you will learn how to manipulate text using the '''sed''' utility.
# Issue a Linux command to create a directory called '''sed'''<br><br>
# Issue a Linux command to <u>change</u> to the '''sed''' directory and confirm that you are located in the '''sed''' directory.<br><br>
# Issue the following Linux command to download copy the data.txt file<br>('''copy and paste''' to save time):<br><span style="color:blue;font-weight:bold;font-family:courier;">wget <nowiki>https://github.com/ULI101/labs/rawcp ~uli101/maintutorialfiles/data.txt<~/nowiki>sed</span><br><br>
# Issue the '''more''' command to quickly view the contents of the '''data.txt''' file.<br>When finished, exit the more command by pressing the letter <span style="color:blue;font-weight:bold;font-family:courier;">q</span>[[Image:sed-1.png|thumb|right|300px|Issuing the '''p''' instruction without using the '''-n''' option (to suppress original output) will display lines twice.]]<br><br>The '''p''' instruction with the '''sed''' command is used to<br>'''print''' (i.e. ''display'') the contents of a text file.<br><br>
# Issue the following Linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed 'p' data.txt</span><br><br>'''NOTE: You should notice that each line appears twice'''.<br><br>The reason why standard output appears twice is that the sed command<br>(without the '''-n option''') displays all lines regardless of an address used.<br><br>We will use '''pipeline commands''' to both display stdout to the screen and save to files<br>for <u>confirmation</u> of running these pipeline commands when run a '''checking-script''' later in this investigation.<br><br>
# Issue a Linux command to create a directory called '''awk'''<br><br>
# Issue a Linux command to <u>change</u> to the '''awk''' directory and confirm you are located in the '''awk''' directory.<br><br>Let's download a database file that contains information regarding classic cars.<br><br>
# Issue the following linux command ('''copy and paste''' to save time):<br><span style="color:blue;font-weight:bold;font-family:courier;">wget <nowiki>https:/cp ~uli101/github.com/ULI101/labs/raw/maintutorialfiles/cars.txt<~/nowiki>awk</span><br><br>
# Issue the '''cat''' command to quickly view the contents of the '''cars.txt''' file.<br><br>The "'''print'''" action (command) is the <u>default</u> action of awk to print<br>all selected lines that match a '''pattern'''.<br><br>This '''action''' (contained in braces) can provide more options<br>such as printing '''specific fields''' of selected lines (or records) from a database.<br><br>[[Image:awk-1.png|thumb|right|400px|Using the awk command to display matches of the pattern '''ford'''.]]
# Issue the following linux command all to display all lines (i.e. records) in the '''cars.txt''' database that matches the pattern (or "make") called '''ford''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '/ford/ {print}' cars.txt</span><br><br>We will use '''pipeline commands''' to both display stdout to the screen and save to files for <u>confirmation</u> of running these pipeline commands when run a '''checking-script''' later in this investigation.<br><br>
# Issue the following linux pipeline command to display the '''car make''',<br>'''year''' and '''quantity''' of cars that '''begin''' with the '''letter 'f'''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$1 ~ /^f/ {print $1,$2,$4}' cars.txt | tee awk-6.txt</span><br><br>[[Image:awk-4.png|thumb|right|400px|Using the awk command to display combined search results based on '''compound operators'''.]]Combined pattern searches can be made<br>by using '''compound operator''' symbols:<br><br>'''&&''' &nbsp;&nbsp;&nbsp;&nbsp;(and)<br>'''||''' &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(or)<br><br>
# Issue the following linux pipeline command to list all '''fords'''<br>whose '''price is greater than $10,000''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$1 ~ /ford/ && $5 > 10000 {print $0}' cars.txt | tee awk-7.txt</span><br><br>
# Issue the following linux command ('''copy and paste''' to save time):<br><span style="color:blue;font-weight:bold;font-family:courier;">wget <nowiki>https:/cp ~uli101/github.com/ULI101/labs/raw/maintutorialfiles/cars2.txt<~/nowiki>awk</span><br><br>
# Issue the '''cat''' command to quickly view the contents of the '''cars2.txt''' file.<br><br>
# Issue the following linux pipeline command to display the '''year'''<br>and '''quantity''' of cars that '''begin''' with the '''letter 'f'''' for the '''cars2.txt''' database:<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$1 ~ /^f/ {print $2,$4}' cars2.txt | tee awk-8.txt</span><br><br>What did you notice?<br><br>The problem is that the '''cars2.txt''' database separates each field by a semi-colon (''';''') <u>instead</u> of '''TAB'''.<br>Therefore, it does not recognize the second and fourth fields.<br><br>You need to issue awk with the -F option to indicate that this file's fields are separated (delimited) by a semi-colorn.<br><br>
simulate a quiz:
https://githubwiki.com/ULI101/labscdot.senecacollege.ca/rawuli101/mainfiles/uli101_week11_practice.docx
Your instructor may take-up these questions during class. It is up to the student to attend classes in order to obtain the answers to the following questions. Your instructor will NOT provide these answers in any other form (eg. e-mail, etc).
# Write a Linux awk command to display the second and third fields for the file: '''~/cars''' for records that match the pattern “chevy”.<br><br>
# Write a Linux awk command to display the first and second fields for all the records contained in the file: '''~/cars'''<br><br>
 
 
_________________________________________________________________________________
 
Author: Murray Saul
 
License: LGPL version 3
Link: https://www.gnu.org/licenses/lgpl.html
 
_________________________________________________________________________________