Open main menu

CDOT Wiki β

Changes

Tutorial10: Sed & Awk Utilities

5 bytes added, 16:53, 8 May 2022
Updates slides and ICT file links to GitHub repo.
|- valign="top" style="padding-left:15px;"
|colspan="2" |'''Slides''':<ul><li>Week 11 Lecture 1 Notes:<br> [[Mediahttps://github.com/ULI101/slides/raw/main/ULI101-Week11.1.pdf | PDF]] | [https://matrixgithub.senecacollege.cacom/ULI101/slides/~chris.johnsonraw/ULI101main/ULI101-Week11.1.pptx PPTX]</li><li>Week 11 Lecture 2 Notes:<br> [[Mediahttps://github.com/ULI101/slides/raw/main/ULI101-Week11.2.pdf | PDF]] | [https://matrix.senecacollegegithub.cacom/~jason.carmanULI101/slides/raw/main/ULI101-Week11.2.pptx PPTX] <br></li></ul>
# Issue a Linux command to create a directory called '''sed'''<br><br>
# Issue a Linux command to <u>change</u> to the '''sed''' directory and confirm that you are located in the '''sed''' directory.<br><br>
# Issue the following Linux command to download the data.txt file<br>('''copy and paste''' to save time):<br><span style="color:blue;font-weight:bold;font-family:courier;">wget <nowiki>https://ictgithub.senecacollege.cacom/ULI101/labs/~murray.saulraw/uli101main/data.txt</nowiki></span><br><br>
# Issue the '''more''' command to quickly view the contents of the '''data.txt''' file.<br>When finished, exit the more command by pressing the letter <span style="color:blue;font-weight:bold;font-family:courier;">q</span>[[Image:sed-1.png|thumb|right|300px|Issuing the '''p''' instruction without using the '''-n''' option (to suppress original output) will display lines twice.]]<br><br>The '''p''' instruction with the '''sed''' command is used to<br>'''print''' (i.e. ''display'') the contents of a text file.<br><br>
# Issue the following Linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed 'p' data.txt</span><br><br>'''NOTE: You should notice that each line appears twice'''.<br><br>The reason why standard output appears twice is that the sed command<br>(without the '''-n option''') displays all lines regardless of an address used.<br><br>We will use '''pipeline commands''' to both display stdout to the screen and save to files<br>for <u>confirmation</u> of running these pipeline commands when run a '''checking-script''' later in this investigation.<br><br>
# Issue a Linux command to create a directory called '''awk'''<br><br>
# Issue a Linux command to <u>change</u> to the '''awk''' directory and confirm you are located in the '''awk''' directory.<br><br>Let's download a database file that contains information regarding classic cars.<br><br>
# Issue the following linux command ('''copy and paste''' to save time):<br><span style="color:blue;font-weight:bold;font-family:courier;">wget <nowiki>https://ictgithub.senecacollege.cacom/ULI101/labs/~murray.saulraw/uli101main/cars.txt</nowiki></span><br><br>
# Issue the '''cat''' command to quickly view the contents of the '''cars.txt''' file.<br><br>The "'''print'''" action (command) is the <u>default</u> action of awk to print<br>all selected lines that match a '''pattern'''.<br><br>This '''action''' (contained in braces) can provide more options<br>such as printing '''specific fields''' of selected lines (or records) from a database.<br><br>[[Image:awk-1.png|thumb|right|400px|Using the awk command to display matches of the pattern '''ford'''.]]
# Issue the following linux command all to display all lines (i.e. records) in the '''cars.txt''' database that matches the pattern (or "make") called '''ford''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '/ford/ {print}' cars.txt</span><br><br>We will use '''pipeline commands''' to both display stdout to the screen and save to files for <u>confirmation</u> of running these pipeline commands when run a '''checking-script''' later in this investigation.<br><br>
# Issue the following linux pipeline command to display the '''car make''',<br>'''year''' and '''quantity''' of cars that '''begin''' with the '''letter 'f'''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$1 ~ /^f/ {print $1,$2,$4}' cars.txt | tee awk-6.txt</span><br><br>[[Image:awk-4.png|thumb|right|400px|Using the awk command to display combined search results based on '''compound operators'''.]]Combined pattern searches can be made<br>by using '''compound operator''' symbols:<br><br>'''&&''' &nbsp;&nbsp;&nbsp;&nbsp;(and)<br>'''||''' &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(or)<br><br>
# Issue the following linux pipeline command to list all '''fords'''<br>whose '''price is greater than $10,000''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$1 ~ /ford/ && $5 > 10000 {print $0}' cars.txt | tee awk-7.txt</span><br><br>
# Issue the following linux command ('''copy and paste''' to save time):<br><span style="color:blue;font-weight:bold;font-family:courier;">wget <nowiki>https://ictgithub.senecacollege.cacom/ULI101/labs/~murray.saulraw/uli101main/cars2.txt</nowiki></span><br><br>
# Issue the '''cat''' command to quickly view the contents of the '''cars2.txt''' file.<br><br>
# Issue the following linux pipeline command to display the '''year'''<br>and '''quantity''' of cars that '''begin''' with the '''letter 'f'''' for the '''cars2.txt''' database:<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$1 ~ /^f/ {print $2,$4}' cars2.txt | tee awk-8.txt</span><br><br>What did you notice?<br><br>The problem is that the '''cars2.txt''' database separates each field by a semi-colon (''';''') <u>instead</u> of '''TAB'''.<br>Therefore, it does not recognize the second and fourth fields.<br><br>You need to issue awk with the -F option to indicate that this file's fields are separated (delimited) by a semi-colorn.<br><br>
simulate a quiz:
https://ictgithub.senecacollege.cacom/ULI101/labs/~murray.saulraw/uli101main/uli101_week11_practice.docx
Your instructor may take-up these questions during class. It is up to the student to attend classes in order to obtain the answers to the following questions. Your instructor will NOT provide these answers in any other form (eg. e-mail, etc).