:In the first investigation of this lab we will work with different data structures. These are stored in a similar way to variables and lists, but they can contain a lot more information and are designed for specific purposes. Each structure has its own advantages and disadvantages, and this lab will emphasize where those important differences lie. The second investigation will focus closely on strings. We have been using and storing strings since our first class; in this lab we will dive into the more complex nature of string manipulation. Finally, this lab will cover how to use a variety of regular expression functions for searching and input validation.
== PYTHON REFERENCE ==
:For additional reference while working through this course:
::'''Tuples'''
::*https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences
::'''Sets'''
::*https://docs.python.org/3/tutorial/datastructures.html#sets
::'''Dictionaries'''
::*https://docs.python.org/3/tutorial/datastructures.html#dictionaries
::'''Lists and List Comprehension'''
::*https://docs.python.org/3/tutorial/introduction.html#lists
::*https://docs.python.org/3/tutorial/datastructures.html#more-on-lists
::*https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions
::'''Strings'''
::*https://docs.python.org/3/tutorial/introduction.html#strings
::*https://docs.python.org/3/library/string.html
::'''Regular Expressions'''
::*https://docs.python.org/3/library/re.html
::*https://docs.python.org/3/howto/regex.html
= INVESTIGATION 1: DATA STRUCTURES =
== PART 1 - Tuple ==
:A Python tuple is an immutable sequence of Python values. It is similar to a list in a lot of ways, except that the values inside cannot be changed.
:'''Perform the Following Steps:'''
:#Start by opening the ipython3 shell<source>
ipython3
</source>
:#Create two tuples to experiment with<source>
t1 = ('Prime', 'Ix', 'Secundus', 'Caladan')
t2 = (1, 2, 3, 4, 5, 6)
</source>
:#Values from a tuple can be retrieved in the same way as from a list.<source>
t1[0]
t2[2:4]
</source>
:#You can also check whether a value exists inside a tuple.<source>
'Ix' in t1
'Geidi' in t1
</source>
:#Try changing a tuple value.<source>
t2[1] = 10
</source>
:#Did it work? Once created, a tuple's values cannot be changed. If you would like a tuple with different values from the one you currently have, you must create a new one.<source>
t3 = t2[2:3]
</source>
:#You can, however, still use most of the basic operations you might expect with tuples.<source>
len(t1)    # the length of the tuple
t1 * 3     # repetition
t1 + t2    # concatenation; remember this creates a new tuple, it does not modify t1 or t2
</source>
:#Like lists, you can also loop through the values of tuples.<source>
for item in t1:
    print('item: ' + item)
</source>
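:The immutability described above can also be checked in a short script. The following is a minimal sketch and not part of the lab's checking script; the names <code>error_message</code> and <code>t4</code> are made up for illustration. It shows that assigning to a tuple index raises a TypeError, while building a new tuple from slices and concatenation works.

```python
t2 = (1, 2, 3, 4, 5, 6)

# Assigning to an index of a tuple raises TypeError because tuples are immutable.
try:
    t2[1] = 10
    error_message = None
except TypeError as err:
    error_message = str(err)

print('error:', error_message)

# To "change" a value, build a new tuple from pieces of the old one.
t4 = t2[:1] + (10,) + t2[2:]
print('t2:', t2)   # the original tuple is unchanged
print('t4:', t4)   # a brand new tuple with the replaced value
```

:Note the trailing comma in <code>(10,)</code>: it is what makes a one-item tuple rather than a number in parentheses.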
== PART 2 - Set ==
:Sets are another structure very similar to lists. Unlike tuples, they can be modified. Sets have two unique characteristics: they are unordered, and they cannot contain duplicate values. Being unordered gives sets added performance, because values are stored by hash, but it also means we cannot pull out a value at a specific position. Any duplicate entries are immediately discarded. Sets are great tools for doing comparisons, such as finding the differences or similarities between multiple sets. Best of all, sets are fast!
:'''Perform the Following Steps:'''
:#Create three sets to experiment with<source>
s1 = {'Prime', 'Ix', 'Secundus', 'Caladan'}
s2 = {1, 2, 3, 4, 5}
s3 = {4, 5, 6, 7, 8}
</source>
:#First, try to access a set value through an index.<source>
s1[0]
</source>
:#This should have produced an error. This is not how to access data inside a set, because sets are unordered. Instead, you can check whether a value is inside the set.<source>
'Ix' in s1
'Geidi' in s1
</source>
:#Sets can be combined together. Any duplicates that the two sets share will be removed. Take a close look at which items are shared between the sets.<source>
s2
s3
s2 | s3         # returns a set containing all values from both sets
s2.union(s3)    # same as s2 | s3
</source>
:#Instead of combining sets, we can find out which values are in both sets. This is the intersection of the sets.<source>
s2
s3
s2 & s3             # returns a set containing all values that s2 and s3 share
s2.intersection(s3) # same as s2 & s3
</source>
:#Sets can have their values compared against other sets. First, let's find out which items are in '''s2''' but not in '''s3'''. This is called a difference. Notice that it only shows values that '''s2''' contains and '''s3''' does not, so it is not the full difference between the sets.<source>
s2
s3
s2 - s3             # returns a set containing all values in s2 that are not found in s3
s2.difference(s3)   # same as s2 - s3
</source>
:#In order to see every difference between both sets, we can find the symmetric difference. This returns a set containing all values that the two sets do not share.<source>
s2
s3
s2 ^ s3                     # returns a set containing all values that both sets DO NOT share
s2.symmetric_difference(s3) # same as s2 ^ s3
</source>
:#These powerful features can be so useful and efficient that you may want to apply them to lists. Lists cannot perform these operations directly; instead, we have to convert the lists into sets, perform the comparison, then convert the resulting set back to a list. There are two problems with doing this. First, sets are unordered, so if the order of the list is important, it will be lost. Second, sets cannot contain duplicate values, so if the list contains any duplicates they will be removed. However, if the list has neither of these requirements, this is a great solution to some problems.<source>
l2 = [1, 2, 3, 4, 5]
l3 = [4, 5, 6, 7, 8]
new_list = list(set(l2).intersection(set(l3))) # set() can make lists into sets. list() can make sets into lists
new_list
</source>
'''Create a Python Script Demonstrating Comparing Sets'''
:'''Perform the Following Instructions'''
::#Create the '''~/ops435/lab4/lab4a.py''' script. The purpose of this script will be to demonstrate the different ways of comparing sets. There will be three functions, each returning a different set comparison.
::#Use this template to get started:<source>
#!/usr/bin/env python3

def join_sets(set1, set2):
    # join_sets will return a set that has every value from both set1 and set2 inside it

def match_sets(set1, set2):
    # match_sets will return a set that contains all values found in both set1 and set2

def diff_sets(set1, set2):
    # diff_sets will return a set that contains all different values which are not shared between the sets

if __name__ == '__main__':
    set1 = set(range(1,10))
    set2 = set(range(5,15))
    print('set1: ', set1)
    print('set2: ', set2)
    print('join: ', join_sets(set1, set2))
    print('match: ', match_sets(set1, set2))
    print('diff: ', diff_sets(set1, set2))
</source>
:::*The match_sets() function should return a set that contains all values found in both sets
:::*The diff_sets() function should return a set that contains all values which are not shared between both sets
:::*The join_sets() function should return a set that contains all values from both sets
:::*All three functions should accept '''two arguments''', both of which are sets
:::*The script should show the exact output as the samples
:::*The script should contain no errors
::::'''Sample Run 1:'''<source>
run lab4a.py
set1: {1, 2, 3, 4, 5, 6, 7, 8, 9}
set2: {5, 6, 7, 8, 9, 10, 11, 12, 13, 14}
join: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}
match: {8, 9, 5, 6, 7}
diff: {1, 2, 3, 4, 10, 11, 12, 13, 14}
</source>
::::'''Sample Run 2(with import):'''<source>
import lab4a
set1 = {1,2,3,4,5}
set2 = {2,1,0,-1,-2}
lab4a.join_sets(set1,set2)
{-2, -1, 0, 1, 2, 3, 4, 5}
lab4a.match_sets(set1,set2)
{1, 2}
lab4a.diff_sets(set1,set2)
{-2, -1, 0, 3, 4, 5}
</source>
:::3. Exit the ipython3 shell, download the checking script and check your work. Enter the following commands from the bash shell.<source>
cd ~/ops435/lab4/
pwd #confirm that you are in the right directory
ls CheckLab4.py || wget matrix.senecac.on.ca/~acoatley-willis/CheckLab4.py
python3 ./CheckLab4.py -f -v lab4a
</source>
:::4. Before proceeding, make certain that you identify and fix any errors in lab4a.py. Proceed to the next step only when the checking script tells you everything is OK.
'''Create a Python Script Demonstrating Comparing Lists'''
:'''Perform the Following Instructions'''
::#Create the '''~/ops435/lab4/lab4b.py''' script. The purpose of this script will be to improve the previous script to perform the same joins, matches, and diffs, but this time on lists.
::#Use the following as a template:<source>
#!/usr/bin/env python3

def join_lists(list1, list2):
    # join_lists will return a list that contains every value from both list1 and list2 inside it

def match_lists(list1, list2):
    # match_lists will return a list that contains all values found in both list1 and list2

def diff_lists(list1, list2):
    # diff_lists will return a list that contains all different values, which are not shared between the lists

if __name__ == '__main__':
    list1 = list(range(1,10))
    list2 = list(range(5,15))
    print('list1: ', list1)
    print('list2: ', list2)
    print('join: ', join_lists(list1, list2))
    print('match: ', match_lists(list1, list2))
    print('diff: ', diff_lists(list1, list2))
</source>
:::*The match_lists() function should return a list that contains all values found in both lists
:::*The diff_lists() function should return a list that contains all values which are not shared between both lists
:::*The join_lists() function should return a list that contains all values from both lists
:::*All three functions should accept '''two arguments''', both of which are lists
:::*The script should show the exact output as the samples
:::*The script should contain no errors
::::'''Sample Run 1:'''<source>
run lab4b.py
list1: [1, 2, 3, 4, 5, 6, 7, 8, 9]
list2: [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
join: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
match: [8, 9, 5, 6, 7]
diff: [1, 2, 3, 4, 10, 11, 12, 13, 14]
</source>
::::'''Sample Run 2(with import):'''<source>
import lab4b
list1 = [1,2,3,4,5]
list2 = [2,1,0,-1,-2]
lab4b.join_lists(list1,list2)
[0, 1, 2, 3, 4, 5, -2, -1]
lab4b.match_lists(list1,list2)
[8, 9, 5, 6, 7]
lab4b.diff_lists(list1,list2)
[1, 2, 3, 4, 10, 11, 12, 13, 14]
</source>
:::3. Exit the ipython3 shell, download the checking script and check your work. Enter the following commands from the bash shell.<source>
cd ~/ops435/lab4/
pwd #confirm that you are in the right directory
ls CheckLab4.py || wget matrix.senecac.on.ca/~acoatley-willis/CheckLab4.py
python3 ./CheckLab4.py -f -v lab4b
</source>
:::4. Before proceeding, make certain that you identify and fix any errors in lab4b.py. Proceed to the next step only when the checking script tells you everything is OK.
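:The two caveats of converting a list into a set (lost order and lost duplicates) can be seen in a small experiment. This is only an illustrative sketch: the names <code>unique_values</code> and <code>back_to_list</code> are invented here, and <code>sorted()</code> is used purely to make the printed result predictable. It is not something the lab or the checking script requires.

```python
l1 = [3, 1, 2, 3, 1]

# Converting to a set removes duplicates; the resulting order is not guaranteed.
unique_values = set(l1)

# sorted() accepts any iterable and returns a new list in ascending order.
back_to_list = sorted(unique_values)

print('original:', l1)
print('as a set:', unique_values)
print('sorted list:', back_to_list)
print('lengths:', len(l1), len(back_to_list))
```

:If the original positions or the repeated values matter to your program, a round trip through a set is the wrong tool.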
== PART 3 - Dictionary ==
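:In Python, a dictionary is a collection of key-value pairs. Like sets, dictionaries are not accessed by position; instead, any value can be retrieved from a dictionary if you know its key. (Older versions of Python did not keep dictionary keys in any particular order; Python 3.7 and later preserve the order in which keys were inserted.) This section will go over how to create, access, and change dictionaries, providing a new tool to store and manipulate data with.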
:'''Perform the Following Steps:'''
::#Start the ipython3 shell:<source>
ipython3
</source>
::#Start by creating a new dictionary to practice with:<source>
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Postal Code': 'M3J3M6'}
</source>
::#The syntax uses '''{}''' to create a dictionary, with key:value pairs placed inside and separated by commas.
::#Take a close look at all the functions available to dictionary objects<source>
dir(dict_york)
help(dict_york)
</source>
::#All values can be viewed by using the dictionary.values() function. This function provides a '''list''' containing all values<source>
help(dict_york.values)
dict_york.values()
</source>
::#All keys can be viewed by using the dictionary.keys() function. This function provides a '''list''' containing all keys<source>
help(dict_york.keys)
dict_york.keys()
</source>
::#We can retrieve individual values from a dictionary by providing the key associated with the value<source>
dict_york['Address']
dict_york['Postal Code']
</source>
::#Dictionary keys can be any immutable values, such as strings, numbers, and tuples. Try adding a new key and value to the dictionary<source>
dict_york['Country'] = 'Canada'
dict_york
dict_york.values()
dict_york.keys()
</source>
::#Study the output and add another key:value pair<source>
dict_york['Province'] = 'BC'
dict_york
dict_york.values()
dict_york.keys()
</source>
::#Dictionary keys must be unique. Attempting to add a key that already exists in the dictionary will overwrite the existing value for that key<source>
dict_york['Province'] = 'ON'
dict_york
dict_york.values()
dict_york.keys()
</source>
::#The objects that contain the values and keys of the dictionary are not real Python lists; they are views of the dictionary
::#However, we can change these views into usable lists by using the list() function. An index can then be used to access individual values<source>
list_of_keys = list(dict_york.keys())
list_of_keys[0]
</source>
::#Lists can be changed into sets if we would like to perform comparisons with another set<source>
set_of_keys = set(dict_york.keys())
set_of_values = set(dict_york.values())
set_of_keys | set_of_values
</source>
::#The lists can be iterated over in a for loop<source>
list_of_keys = list(dict_york.keys())
for key in list_of_keys:
    print(key)
for value in dict_york.values():
    print(value)
</source>
::#The values and keys can be looped over using an index as well
::#The range() function provides a sequence of numbers in a range.
::#The len() function provides the number of items in a list. Used together, len() and range() create a sequence of usable indexes for a specific list<source>
list_of_keys = list(dict_york.keys())
list_of_values = list(dict_york.values())
list_of_indexes = range(0, len(dict_york.keys()))
list_of_indexes
list_of_keys[0]
list_of_values[0]
</source>
::#Using this list of indexes we are able to pair the keys and values of two separate lists<source>
list_of_keys = list(dict_york.keys())
list_of_values = list(dict_york.values())
for index in range(0, len(list_of_keys)):
    print(list_of_keys[index] + '--->' + list_of_values[index])
</source>
::#Looping using indexes is not the best way to loop through a dictionary. A new dictionary could be created using this method, but this is '''not good'''<source>
list_of_keys = list(dict_york.keys())
list_of_values = list(dict_york.values())
new_dictionary = {}
for index in range(0, len(list_of_keys)):
    new_dictionary[list_of_keys[index]] = list_of_values[index]
</source>
::#The above method uses a lot of memory and loops. The best method to create a dictionary from two lists is to use the zip() function<source>
list_of_keys = list(dict_york.keys())
list_of_values = list(dict_york.values())
new_dictionary = dict(zip(list_of_keys, list_of_values))
</source>
::#Looping through the keys in a dictionary also provides an easy way to get the value for each key at the same time<source>
for key in dict_york.keys():
    print(key + '--->' + dict_york[key])
</source>
::#Even better than the above, both key and value can be extracted in a single for loop using the dictionary's items() function, which provides each key together with its value<source>
for key, value in dict_york.items():
    print(key + ' | ' + value)
</source>
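:The difference between a dictionary view and a real list, mentioned in the steps above, is easiest to see when the dictionary changes after the view was taken. The following is a small sketch for experimentation only; the names <code>key_view</code>, <code>key_snapshot</code>, and <code>rebuilt</code> are hypothetical and do not appear in the lab scripts.

```python
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto'}

# keys() returns a view object, not a list; it reflects later changes.
key_view = dict_york.keys()
key_snapshot = list(dict_york.keys())   # a real list, copied at this moment

dict_york['Province'] = 'ON'

print('view:', list(key_view))          # includes the key added afterwards
print('snapshot:', key_snapshot)        # still holds only the original keys

# zip() pairs items from two sequences; dict() turns the pairs into a dictionary.
rebuilt = dict(zip(dict_york.keys(), dict_york.values()))
print('rebuilt == original:', rebuilt == dict_york)
```

:Taking a list() snapshot is the safer choice whenever you plan to modify the dictionary while working with its keys.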
'''Create a Python Script for Managing Dictionaries'''
:'''Perform the Following Instructions'''
::#Create the '''~/ops435/lab4/lab4c.py''' script. The purpose of this script will be to create dictionaries, extract data from dictionaries, and to make comparisons between dictionaries.
::#Use the following as a template:<source>
#!/usr/bin/env python3

# Dictionaries
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
dict_newnham = {'Address': '1750 Finch Ave E', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M2J2X5', 'Province': 'ON'}
# Lists
list_keys = ['Address', 'City', 'Country', 'Postal Code', 'Province']
list_values = ['70 The Pond Rd', 'Toronto', 'Canada', 'M3J3M6', 'ON']

def create_dictionary(keys, values):
    # Place code here

def split_dictionary(dictionary):
    # Place code here

def shared_values(dict1, dict2):
    # Place code here

if __name__ == '__main__':
    york = create_dictionary(list_keys, list_values)
    print('York: ', york)
    keys, values = split_dictionary(dict_newnham)
    print('Newnham Keys: ', keys)
    print('Newnham Values: ', values)
    keys, values = split_dictionary(york)
    print('York Keys: ', keys)
    print('York Values: ', values)
    common = shared_values(dict_york, dict_newnham)
    print('Shared Values', common)
</source>
:::*The script should contain '''three''' functions
:::*create_dictionary() accepts two lists as arguments, keys and values, and combines these lists to create a dictionary
:::*create_dictionary() '''returns a dictionary''' that has the keys and associated values from the lists
:::*split_dictionary() accepts a single dictionary as an argument and splits the dictionary into two lists, keys and values
:::*split_dictionary() '''returns two lists''': return keys, values
:::*shared_values() accepts two dictionaries as arguments and finds all values that are shared between the two dictionaries
:::*shared_values() '''returns a set''' containing ONLY values found in BOTH dictionaries
:::*Make sure the functions accept the correct number of arguments
:::*The script should show the exact output as the samples
:::*The script should contain no errors
::::'''Sample Run 1:'''<source>
run lab4c.py
York: {'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Address': '70 The Pond Rd', 'Province': 'ON', 'City': 'Toronto'}
Newnham Keys: ['Country', 'Postal Code', 'Address', 'Province', 'City']
Newnham Values: ['Canada', 'M2J2X5', '1750 Finch Ave E', 'ON', 'Toronto']
York Keys: ['Country', 'Postal Code', 'Address', 'Province', 'City']
York Values: ['Canada', 'M3J3M6', '70 The Pond Rd', 'ON', 'Toronto']
Shared Values {'Canada', 'ON', 'Toronto'}
</source>
::::'''Sample Run 2(with import):'''<source>
import lab4c
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
dict_newnham = {'Address': '1750 Finch Ave E', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M2J2X5', 'Province': 'ON'}
list_keys = ['Address', 'City', 'Country', 'Postal Code', 'Province']
list_values = ['70 The Pond Rd', 'Toronto', 'Canada', 'M3J3M6', 'ON']
york = lab4c.create_dictionary(list_keys, list_values)
york
{'Address': '70 The Pond Rd',
'City': 'Toronto',
'Country': 'Canada',
'Postal Code': 'M3J3M6',
'Province': 'ON'}
keys, values = lab4c.split_dictionary(dict_newnham)
keys
['Country', 'Postal Code', 'Address', 'Province', 'City']
values
['Canada', 'M2J2X5', '1750 Finch Ave E', 'ON', 'Toronto']
keys, values = lab4c.split_dictionary(york)
keys
['Country', 'Postal Code', 'Address', 'Province', 'City']
values
['Canada', 'M3J3M6', '70 The Pond Rd', 'ON', 'Toronto']
common = lab4c.shared_values(dict_york, dict_newnham)
common
{'Canada', 'ON', 'Toronto'}
</source>
:::3. Exit the ipython3 shell, download the checking script and check your work. Enter the following commands from the bash shell.<source>
cd ~/ops435/lab4/
pwd #confirm that you are in the right directory
ls CheckLab4.py || wget matrix.senecac.on.ca/~acoatley-willis/CheckLab4.py
python3 ./CheckLab4.py -f -v lab4c
</source>
:::4. Before proceeding, make certain that you identify and fix any errors in lab4c.py. Proceed to the next step only when the checking script tells you everything is OK.
== PART 4 - List Comprehension ==
:We've already covered lists to a degree. Let's move on to more advanced functions for using and generating lists. This is a very common practice in Python: understanding how to generate, manipulate, and apply functions to items inside a list can be incredibly useful. List comprehension is a way to build new lists from existing lists, often faster than simply looping over them.
:Let's start by creating a list and applying a function to each item in the list. The code below prints the square of each item.
<source>
l1 = [1, 2, 3, 4, 5]
for item in l1:
    print(item ** 2)
</source>
If we would like to store these squares for later use, we can create a new list and append the squares to it. This will generate a new list that contains squared values in the same positions of the first list. What we are doing is using an existing list to create a new list.
<source>
l1 = [1, 2, 3, 4, 5]
l2 = []
for item in l1:
    l2.append(item ** 2)
l1
l2
</source>
Let's take another step here and move the squaring of numbers out into its own separate function. While squaring is a simple operation, this example could instead use a more complex function that does a lot more processing on each item in the list.
<source>
def square(number):
    return number ** 2
l1 = [1, 2, 3, 4, 5]
l2 = []
for item in l1:
    l2.append(square(item))
l1
l2
</source>
The map() function can be used to apply a function to each item in a list. This is exactly what we did above, but with cleaner syntax: it removes the loop, along with the variable we had to create for it. This makes our work a little more efficient while performing the same task.
<source>
def square(number):
    return number ** 2
l1 = [1, 2, 3, 4, 5]
l2 = list(map(square, l1))
l1
l2
</source>
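One detail worth knowing about map(): in Python 3 it returns an iterator rather than a list, which is why its result is usually wrapped in list(). The sketch below illustrates this under the same squaring example; the names <code>squares_iterator</code> and <code>leftover</code> are invented for the demonstration.

```python
def square(number):
    return number ** 2

l1 = [1, 2, 3, 4, 5]

squares_iterator = map(square, l1)   # nothing has been calculated yet
print(type(squares_iterator))

l2 = list(squares_iterator)          # consumes the iterator and builds a list
print(l2)

# A map object can only be consumed once; a second pass yields nothing.
leftover = list(squares_iterator)
print(leftover)
```

If you need the results more than once, store them in a list as soon as you create them.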
The map() call above required us to provide it with a function and a list. This meant that before we could use map() we needed to define a function earlier in the script. We can avoid this entire function definition through the use of anonymous functions: simple functions that are created without a def statement and passed off for use. Below we will use lambda, which returns a function that we can use immediately. The function takes one argument, x, and performs a single operation on it: squaring it.
<source>
square = lambda x: x ** 2
l1 = [1,2,3,4,5]
l2 = list(map(square, l1))
l1
l2
</source>
The above code is actually not particularly good: the whole purpose of using lambda was to avoid the function definition and quickly return a function, yet we still gave it a name. However, it does break down exactly what lambda does: it returns a function for use. Let's fix this by removing the square variable and using the function returned by lambda directly. Remember what map() requires: its first argument is a function, and its second argument is a list. Here lambda returns a function that is passed straight in as the first argument.
<source>
l1 = [1,2,3,4,5]
l2 = list(map(lambda x: x ** 2, l1))
l1
l2
</source>
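The same result can be written as a list comprehension, which is the technique this part is named after. The following is a minimal sketch for comparison with the map() and lambda version; the names <code>squares</code> and <code>even_squares</code> are chosen here for illustration only.

```python
l1 = [1, 2, 3, 4, 5]

# Equivalent to list(map(lambda x: x ** 2, l1)): build a new list from l1.
squares = [x ** 2 for x in l1]
print(squares)

# An optional condition at the end filters which items are included.
even_squares = [x ** 2 for x in l1 if x % 2 == 0]
print(even_squares)
```

A comprehension reads as "the expression, for each item, in the existing list", and many Python programmers find it easier to follow than map() with a lambda.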
= INVESTIGATION 2: STRINGS =
:Strings are in their most basic form a list of characters, or a bit of text. Strings store text so that we can use them later. In this section we will cover more than just displaying that text to the screen. Here, we will go over cutting strings into sub-strings, joining strings together, searching through strings, and matching strings against patterns.
:We can concatenate strings using the plus sign, combining strings together to create a brand new string. Strings are immutable, just like tuples. This means that every time you change a string, you are actually creating a new string.
<source>
str1 = 'Paul'
str2 = 'Atreides'
str3 = str1 + ' ' + str2
str3
</source>
:Repetition is also a useful tool that can be used with strings. Repetition repeats the string a specific number of times. This is useful any time you would otherwise be typing the same thing over and over by hand.
<source>
str1 = 'Paul'
str2 = 'Atreides'
str3 = str1 + ' ' + str2 + ' ' + 'I'*3
str3
</source>
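:String immutability behaves the same way as tuple immutability. As a small illustrative sketch (the names <code>changed_in_place</code>, <code>new_str</code>, and <code>separator</code> are made up for this example), the code below tries to overwrite one character, then builds a new string instead, and finally uses repetition to draw a separator line.

```python
str1 = 'Paul'

# Strings do not support item assignment, so this raises TypeError.
try:
    str1[0] = 'S'
    changed_in_place = True
except TypeError:
    changed_in_place = False

print('changed in place:', changed_in_place)

# Build a new string from a new first letter plus a slice of the old string.
new_str = 'S' + str1[1:]
print(str1, '->', new_str)

# Repetition is handy for things like separator lines.
separator = '-' * 20
print(separator)
```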
== PART 1 - String Manipulation ==