Tutorial9: Regular Expressions

12 bytes added, 10:16, 17 July 2020
INVESTIGATION 2: EXTENDED REGULAR EXPRESSIONS
# Issue the following Linux command to download another data file called '''words.dat''':<br><span style="color:blue;font-weight:bold;font-family:courier;">wget <nowiki>https://ict.senecacollege.ca/~murray.saul/uli101/words.dat</nowiki></span><br><br>
# Use the '''more''' command to quickly view the contents of the '''numbers2.dat''' file.<br>You should notice that this file contains valid numbers and more invalid numbers. When finished, exit the '''more''' command.<br><br>
# Issue the following Linux command to display lines that contain two or more consecutive occurrences of the word "the":<br><span style="color:blue;font-weight:bold;font-family:courier;">egrep -i "(the){2,}" words.dat | more</span><br><br>'''NOTE:''' You should NOT see any output, because the pattern does not include a space after the word "the". Words are usually separated by spaces, so the file contains no occurrences of "thethe" (as opposed to "the the") and nothing matches.<br><br>
# Reissue the previous command, this time including a space inside the parentheses:<br><span style="color:blue;font-weight:bold;font-family:courier;">egrep -i "(the ){2,}" words.dat | more</span><br><br>The "or" symbol '''|''' can be used inside the grouping symbols to match alternative groups of characters. Again, it is important to follow each group of characters with a space.<br><br>
# Issue the following Linux command to search for two or more consecutive occurrences of the word "the" or the word "and":<br><span style="color:blue;font-weight:bold;font-family:courier;">egrep -i "(the |and ){2,}" words.dat | more</span><br><br>
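
The three patterns above can also be tried on a small file of your own, which makes it easier to see why each line does or does not match. The following is a minimal sketch, not part of the original tutorial: the sample lines are invented (they are not the contents of words.dat), and it uses <code>grep -E</code>, which accepts the same extended regular expressions as <code>egrep</code>.

```shell
# Hypothetical sample data; this is NOT the real words.dat file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
the the cat sat
thethe is not usually written
The and the dog
one the only
and and then
EOF

# No space in the group: only "the" repeated back to back ("thethe") matches.
out1=$(grep -E -i "(the){2,}" "$sample")

# Space inside the group: "the " must appear at least twice in a row.
out2=$(grep -E -i "(the ){2,}" "$sample")

# Alternation: any run of two or more "the " or "and " groups matches.
out3=$(grep -E -i "(the |and ){2,}" "$sample")

printf 'Pattern 1:\n%s\n\n' "$out1"
printf 'Pattern 2:\n%s\n\n' "$out2"
printf 'Pattern 3:\n%s\n' "$out3"

rm -f "$sample"
```

With this sample data, the first pattern matches only the "thethe" line, the second only the "the the" line, and the third also matches "The and the dog" and "and and then", because the two alternatives can be mixed within one run.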