CDOT Wiki β

Tutorial9: Regular Expressions

35 bytes removed, 12:10, 27 February 2021
INVESTIGATION 1: SIMPLE & COMPLEX REGULAR EXPRESSIONS
# Issue the '''ls''' command to confirm that the text file was downloaded.<br><br>
# Use the '''more''' command to quickly view the contents of the '''textfile1.txt''' file.<br><br>Although there are several Linux commands that use regular expressions,<br>we will only be using the '''grep''' command for this investigation.<br><br>
# Issue the following Linux pipeline command to match the pattern '''the''' within '''textfile1.txt''':<br><span style="color:blue;font-weight:bold;font-family:courier;">grep "the" textfile1.txt | more</span><br><br>
# Now, issue the grep Linux pipeline command with the '''-i''' option to ignore case sensitivity:<br><span style="color:blue;font-weight:bold;font-family:courier;">grep -i "the" textfile1.txt | more</span><br><br>What do you notice is different with this pipeline command?<br><br>You will notice that the pattern "the" is also matched inside larger words that contain it. You can use the '''-w''' option with the grep command to match the pattern only as a whole word.<br><br>
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">grep -w -i "the" textfile1.txt | more</span><br><br>You should now see only lines of text that contain the word '''"the"'''.<br><br>Matching literal or simple regular expressions can be useful, but they are limited in how precisely they can match patterns.<br>For example, you may want to search for a pattern at a specific location within the string of text (like at the beginning or end of the string).<br><br>There are other regular expression tools that provide more precise matches: '''complex''' and '''extended''' regular expressions. We will look at complex regular expression symbols now, and we will discuss ''extended regular expressions'' in the next section of this tutorial.<br><br>
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">grep -w -i "^the" textfile1.txt | more</span><br><br>The '''^''' symbol is an anchor. In this case, it only matches the <u>word</u> "the" (in upper or lowercase) at the beginning of strings.<br>The '''$''' symbol is used to anchor patterns at the end of strings.<br><br>
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">grep -w -i "the$" textfile1.txt | more</span><br><br>What do you notice?<br><br>
# Issue the following Linux pipeline command to anchor the word "the" simultaneously at the beginning and the end of the string:<br><span style="color:blue;font-weight:bold;font-family:courier;">grep -w -i "^the$" textfile1.txt | more</span><br><br>What do you notice?<br><br>Anchoring patterns at both the <u>beginning</u> and <u>ending</u> of strings can greatly assist in building more robust search patterns.<br>We will now demonstrate '''simultaneous anchoring''' with other complex regular expression symbols.<br><br>
# Issue the following command to match strings that begin with any 3 characters:<br><span style="color:blue;font-weight:bold;font-family:courier;">grep "^..." textfile1.txt | more</span><br><br>What do you notice?<br><br>
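To check your understanding of the options and anchors used above, the following is a minimal sketch that runs the same patterns against a small sample file. The sample lines and the variable names are hypothetical and are NOT the contents of '''textfile1.txt'''. The '''-c''' option (print a count of matching lines) is not part of the steps above and is used here only to make the results easy to compare.

```shell
# Hypothetical sample data (NOT the tutorial's textfile1.txt).
sample=$(mktemp)
printf '%s\n' \
  'The quick brown fox' \
  'jumps over the lazy dog' \
  'Then another line' \
  'bathe' \
  'the' \
  'ab' > "$sample"

# -c prints the number of matching lines instead of the lines themselves.
# Single quotes keep the shell from interpreting the $ symbol.
plain=$(grep -c 'the' "$sample")        # case-sensitive, matches inside words
nocase=$(grep -c -i 'the' "$sample")    # -i: ignore case
word=$(grep -c -w -i 'the' "$sample")   # -w: match whole words only
start=$(grep -c -w -i '^the' "$sample") # word "the" at the beginning of a line
end=$(grep -c -w -i 'the$' "$sample")   # word "the" at the end of a line
both=$(grep -c -w -i '^the$' "$sample") # line contains only the word "the"
three=$(grep -c '^...' "$sample")       # lines with at least 3 characters

echo "plain=$plain nocase=$nocase word=$word"
echo "start=$start end=$end both=$both three=$three"

rm -f "$sample"
```

With this sample data, the line "ab" is the only one not counted by the '''^...''' pattern, because it contains fewer than 3 characters. Likewise, "Then" and "bathe" are counted without '''-w''' but not with it.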