OBJECTIVES

This lab will provide you with additional scripting tools to help you write even more effective Python scripts, which will be applied to practical tasks involving VM management and deployment in future labs.

The first investigation in this lab will focus on data structures. According to TechTarget (http://searchsqlserver.techtarget.com/definition/data-structure):
"A data structure is defined as a specialized format for organizing and storing data. Any data structure is designed to organize data to suit a specific purpose so that it can be accessed and worked with in appropriate ways."

Each data structure has its own advantages and limitations. This lab will emphasize the most important differences as they relate to Python scripting.

The second investigation will focus closely on strings. You have been using and storing strings since our first class; in this lab, however, we will dive into the more complex aspects of string manipulation. Finally, this lab will cover how to use a variety of regular expression functions for searching and input validation.

PYTHON REFERENCE

As you develop your Python scripting skills, you may start to feel "overwhelmed" by the volume of information that you have absorbed over these labs. One way to help is to write what you have learned in your labs into your lab logbook. In programming, it is also important to use online references to obtain information about Python scripting techniques and tools.

Below is a table with links to useful online Python reference sites (by category). You may find these references useful when performing assignments, etc.

Data Structures: Tuples, Sets
Lists & List Comprehension: Lists, More on Lists, List Comprehensions
Strings: Strings, String Comparisons
Regular Expressions: Regular Expression Operations, Regular Expressions (HOWTO)
Miscellaneous: Dictionaries

INVESTIGATION 1: DATA STRUCTURES

In this investigation, you will learn several tools for working with data structures in Python scripting.

These tools include tuples, sets, dictionaries, and more advanced list functions.

PART 1 - Tuples

Many people confuse a tuple with a list (which you learned about in a previous lab). A tuple is similar to a list, but its values cannot be changed. In fact, the structure of a tuple cannot be changed either (for example, by adding or removing elements).

There are many advantages to using tuples when creating Python scripts:

Data protection (e.g. values that are NOT allowed to change, such as an income tax rate, a social insurance number, etc.)
The structure of a tuple cannot be changed (e.g. the structure cannot be corrupted)
Tuples can be used as keys in dictionaries (dictionary keys are NOT allowed to change)
Tuples allow for faster access than lists

The term used to indicate that a data structure cannot be changed is immutable (as opposed to "mutable", which means the data structure can be changed).
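
As a small illustration of what immutability allows (a sketch only, not one of the lab steps; the names point and labels are made up for this example), a tuple can serve as a dictionary key while a list cannot, and assigning to a tuple element is rejected:

```python
# Hypothetical example data: the names are illustrative only
point = (3, 4)                      # an immutable tuple
labels = {point: 'corner'}          # a tuple of immutable values can be a dictionary key

# A list is mutable, so it cannot be used as a dictionary key
try:
    bad = {[3, 4]: 'corner'}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

# Assigning to an element of a tuple is rejected
try:
    point[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(labels[(3, 4)])
print(list_key_allowed, tuple_changed)
```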

Perform the Following Steps:

Launch your ipython3 shell:
```
ipython3
```
Let's create two tuples, so we can learn how to use them and learn how they differ from lists.

Note: tuples are defined using parentheses ( ), whereas lists are defined using square brackets [ ]

Issue the following:

t1 = ('Prime', 'Ix', 'Secundus', 'Caladan')
t2 = (1, 2, 3, 4, 5, 6)

Values from a tuple can be retrieved in the same way as a list. For example, issue the following:
```
t1[0]
t2[2:4]
```
You can also check to see whether a value exists inside a tuple or not. To demonstrate, issue the following:
```
'Ix' in t1
'Geidi' in t1
```
Let's now see how a tuple differs from a list. We will now create a list and note the difference between them.

Issue the following to create a list:

list2 = [ 'uli101', 'ops235', 'ops335', 'ops435', 'ops535', 'ops635' ]

See if you can change the value of your list by issuing the following:
```
list2[0]= 'ica100'
list2[0]
print(list2)
```
You should have been successful in changing the value of your list.
Now, try changing the value of your previously-created tuple by issuing:
```
t2[1] = 10
```
Did it work? You should have received a TypeError: once a tuple has been created, its values cannot be changed.

If you would like a tuple with different values than the tuple you currently have, then you must create a new one.
To create a new tuple, issue the following:
```
t3 = t2[2:3]
```
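
Slicing alone can only select values that already exist. As a rough sketch of how a tuple with one value "changed" could be produced (the name t4 is hypothetical and not used elsewhere in this lab), slices of the old tuple can be concatenated with a one-element tuple to build a brand-new tuple:

```python
t2 = (1, 2, 3, 4, 5, 6)

# "Replace" the value at index 1 by building a new tuple from slices
# of the old one plus a one-element tuple (note the trailing comma)
t4 = t2[:1] + (10,) + t2[2:]

print(t2)   # the original tuple is untouched
print(t4)
```

The original tuple is never modified; t4 simply refers to a new object.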
You can use most of the basic operations with tuples as you did with lists.

To demonstrate, issue the following:

len(t1)     # list the length of the tuple
t1 * 3      # repetition
t1 + t2     # concatenation, remember this is creating a new tuple, not modifying

Also, as with lists, you can use loops with tuples. Issue the following to demonstrate:
```
for item in t1:
    print('item: ' + item)
```

PART 2 - Sets

So far, you have been exposed to two structures that are used to contain data: lists and tuples. You can modify the values within a list as well as modify the structure of a list (i.e. add and remove elements), whereas you cannot do either with a tuple.

In this section, you will learn about sets. A set has characteristics similar to a list, but there are two major differences:

Sets are un-ordered
Sets cannot contain duplicate values

Since duplicate entries are automatically removed when using sets, they are very useful for performing tasks such as comparisons: finding similarities or differences between multiple sets. Checking whether a value is in a set is also fast.
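
A minimal sketch of the duplicate-removal behaviour, using a made-up list of host names (hosts and unique_hosts are hypothetical names chosen for this illustration):

```python
# Hypothetical list of host names containing repeats
hosts = ['web1', 'db1', 'web1', 'web2', 'db1']

unique_hosts = set(hosts)       # duplicates are dropped automatically

print(len(hosts))               # number of items in the list
print(len(unique_hosts))        # number of distinct items in the set
print('web2' in unique_hosts)   # membership test
```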

Perform the Following Steps:

Within your ipython3 shell, create a few sets to work with by issuing the following:
```
s1 = {'Prime', 'Ix', 'Secundus', 'Caladan'}
s2 = {1, 2, 3, 4, 5}
s3 = {4, 5, 6, 7, 8}
```
Note: Sets are defined by using braces { } as opposed to tuples that use parentheses ( ), or lists that use square brackets [ ]
Try to issue the following to access a set through the index.
```
s1[0]
```
This should have created an error. Values inside a set cannot be accessed by index because sets are un-ordered. Instead, you should use the method from the previous section to check whether a value is contained within the set.
To demonstrate, issue the following:
```
'Ix' in s1
'Geidi' in s1
```
Sets can be combined, but it is important to note that any duplicate values (shared among the sets) will appear only once in the result.
Issue the following, and note the items (and values) that are common to the following sets:
```
s2
s3
```
Now, issue the following to return a set containing only UNIQUE values (no duplicates) from both sets:
```
s2 | s3         # returns a set containing all values from both sets
s2.union(s3)    # same as s2 | s3
```
Notice that both methods above provide the same result, but the first method requires fewer keystrokes.

Instead of combining sets, we can display values that are common to both sets. This is known in mathematical terms as an intersection between the sets.

To demonstrate intersection between sets s2 and s3, issue the following:

s2 & s3             # returns a set containing all values that s2 and s3 share
s2.intersection(s3) # same as s2 & s3

Sets can also have their values compared against other sets. First, find out which items are in s2 but not in s3. This is called a difference. Notice that it only shows values that s2 contains and s3 does not, so it does not show every difference between the two sets.
```
s2
s3
s2 - s3             # returns a set containing all values in s2 that are not found in s3
s2.difference(s3)   # same as s2 - s3
```
In order to see every difference between both sets, you need to find the symmetric difference. This will return a set containing all values that are found in only one of the two sets.
To demonstrate, issue the following:
```
s2 ^ s3                     # returns a set containing all values that both sets DO NOT share
s2.symmetric_difference(s3) # same as s2 ^ s3
```
Note: the set() function can make lists into sets, and the list() function can make sets into lists
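
As a recap of the four operations above (a sketch only; the variable names union, intersection, difference and symmetric are chosen for this illustration), the following shows how the symmetric difference relates to the other three:

```python
s2 = {1, 2, 3, 4, 5}
s3 = {4, 5, 6, 7, 8}

union = s2 | s3                  # every value from either set
intersection = s2 & s3           # values found in both sets
difference = s2 - s3             # values in s2 that are not in s3
symmetric = s2 ^ s3              # values found in exactly one of the sets

# The symmetric difference is the union with the shared values removed
print(symmetric == union - intersection)
# It is also the two one-sided differences combined
print(symmetric == (s2 - s3) | (s3 - s2))
```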

These powerful features can be useful and efficient. Unfortunately, lists cannot perform these operations unless we convert the lists into sets. In order to do that, you should first convert the lists to sets, then perform the comparison, and finally convert the resulting set back into a list.

There are two problems with performing the above-mentioned technique:

Sets are un-ordered, so if the order of the list is important, this technique will cause problems because the order is lost.
Sets cannot contain duplicate values, so if the list contains any duplicate values they will be removed.

However, if neither the order nor duplicate values matter for your list, this is a great solution to some problems.

To demonstrate, issue the following:

l2 = [1, 2, 3, 4, 5]
l3 = [4, 5, 6, 7, 8]
new_list = list(set(l2).intersection(set(l3)))  # set() can make lists into sets; list() can make sets into lists
new_list
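
If the order of the first list does matter, one possible workaround (a sketch under that assumption, not the approach required by this lab; lookup and ordered_match are hypothetical names) is to convert only the second list into a set and walk through the first list in order:

```python
l2 = [5, 3, 1, 3, 4]
l3 = [4, 5, 6, 7, 8]

# Convert only the second list to a set, for fast membership tests
lookup = set(l3)

# Walk the first list in its original order, keeping values that also appear in l3
ordered_match = []
for value in l2:
    if value in lookup:
        ordered_match.append(value)

print(ordered_match)
```

Note that this version keeps the order of l2, and would also keep any matching value that is repeated in l2.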

Create a Python Script Demonstrating Comparing Sets

Perform the Following Instructions

Create the ~/ops435/lab4/lab4a.py script. The purpose of this script will be to demonstrate the different ways of comparing sets. There will be three functions, each returning a different set comparison.

Use the following template to get started:

#!/usr/bin/env python3

def join_sets(set1, set2):
    # join_sets will return a set that has every value from both set1 and set2 inside it
    pass  # placeholder: replace with your code

def match_sets(set1, set2):
    # match_sets will return a set that contains all values found in both set1 and set2
    pass  # placeholder: replace with your code

def diff_sets(set1, set2):
    # diff_sets will return a set that contains all different values which are not shared between the sets
    pass  # placeholder: replace with your code

if __name__ == '__main__':
    set1 = set(range(1,10))
    set2 = set(range(5,15))
    print('set1: ', set1)
    print('set2: ', set2)
    print('join: ', join_sets(set1, set2))
    print('match: ', match_sets(set1, set2))
    print('diff: ', diff_sets(set1, set2))

The match_sets() function should return a set that contains all values found in both sets
The diff_sets() function should return a set that contains all values which are not shared between both sets
The join_sets() function should return a set that contains all values from both sets
All three functions should accept two arguments, both of which are sets
The script should show exactly the same output as the samples
The script should contain no errors

Sample Run 1:

run lab4a.py
set1:  {1, 2, 3, 4, 5, 6, 7, 8, 9}
set2:  {5, 6, 7, 8, 9, 10, 11, 12, 13, 14}
join:  {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}
match:  {8, 9, 5, 6, 7}
diff:  {1, 2, 3, 4, 10, 11, 12, 13, 14}

Sample Run 2 (with import):

import lab4a
set1 = {1,2,3,4,5}
set2 = {2,1,0,-1,-2}
lab4a.join_sets(set1,set2)
{-2, -1, 0, 1, 2, 3, 4, 5}
lab4a.match_sets(set1,set2)
{1, 2}
lab4a.diff_sets(set1,set2)
{-2, -1, 0, 3, 4, 5}

3. Exit the ipython3 shell, download the checking script and check your work. Enter the following commands from the bash shell.

cd ~/ops435/lab4/
pwd #confirm that you are in the right directory
ls CheckLab4.py || wget matrix.senecac.on.ca/~acoatley-willis/CheckLab4.py
python3 ./CheckLab4.py -f -v lab4a

4. Before proceeding, make certain that you identify and fix any and all errors in lab4a.py. Proceed to the next step only when the checking script tells you everything is OK.

Create a Python Script Demonstrating Comparing Lists

Perform the Following Instructions

Create the ~/ops435/lab4/lab4b.py script. The purpose of this script will be to improve the previous script to perform the same joins, matches, and diffs, but this time on lists.

Use the following as a template:

#!/usr/bin/env python3

def join_lists(list1, list2):
    # join_lists will return a list that contains every value from both list1 and list2 inside it
    pass  # placeholder: replace with your code

def match_lists(list1, list2):
    # match_lists will return a list that contains all values found in both list1 and list2
    pass  # placeholder: replace with your code

def diff_lists(list1, list2):
    # diff_lists will return a list that contains all different values, which are not shared between the lists
    pass  # placeholder: replace with your code

if __name__ == '__main__':
    list1 = list(range(1,10))
    list2 = list(range(5,15))
    print('list1: ', list1)
    print('list2: ', list2)
    print('join: ', join_lists(list1, list2))
    print('match: ', match_lists(list1, list2))
    print('diff: ', diff_lists(list1, list2))

The match_lists() function should return a list that contains all values found in both lists
The diff_lists() function should return a list that contains all values which are not shared between both lists
The join_lists() function should return a list that contains all values from both lists
All three functions should accept two arguments, both of which are lists
The script should show exactly the same output as the samples
The script should contain no errors

Sample Run 1:

run lab4b.py
list1:  [1, 2, 3, 4, 5, 6, 7, 8, 9]
list2:  [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
join:  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
match:  [8, 9, 5, 6, 7]
diff:  [1, 2, 3, 4, 10, 11, 12, 13, 14]

Sample Run 2 (with import):

import lab4b
list1 = [1,2,3,4,5]
list2 = [2,1,0,-1,-2]
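lab4b.join_lists(list1,list2)
[0, 1, 2, 3, 4, 5, -2, -1]
lab4b.match_lists(list1,list2)
[1, 2]
lab4b.diff_lists(list1,list2)
[0, 3, 4, 5, -2, -1]

Note: because these results pass through sets, the order of the elements in your lists may differ from the order shown here.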

3. Exit the ipython3 shell, download the checking script and check your work. Enter the following commands from the bash shell.

cd ~/ops435/lab4/
pwd #confirm that you are in the right directory
ls CheckLab4.py || wget matrix.senecac.on.ca/~acoatley-willis/CheckLab4.py
python3 ./CheckLab4.py -f -v lab4b

4. Before proceeding, make certain that you identify and fix any and all errors in lab4b.py. Proceed to the next step only when the checking script tells you everything is OK.

PART 3 - Dictionaries

By now, you have probably been exposed to database terminology. For example, a database is a collection of related records. In turn, records are a collection of related fields. In order to access a record in a database, you would need to access it by its key field(s). In other words, those key field(s) are a key that unlocks access to a record within a database.

In Python, a dictionary is a collection of key-value pairs. Like sets, dictionaries are not accessed by position (before Python 3.7 their order was not guaranteed at all; from Python 3.7 onward they keep insertion order). Instead, any value can be retrieved from a dictionary if you know its key. This section will go over how to create, access, and change dictionaries, providing a powerful new tool to store and manipulate data.

Perform the Following Steps:

Launch the ipython3 shell:
```
ipython3
```
Let's begin by creating a new dictionary (for practice):
```
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Postal Code': 'M3J3M6'}
```
You should note that the syntax to define a dictionary is similar to defining sets (i.e. using {}).
Unlike sets, dictionaries contain key:value pairs, and each key:value pair is separated from the next by a comma.

You can get help associated with your dictionary by using functions such as dir() and help().
Issue the following and note all the available functions and how to obtain assistance with dictionary objects:
```
dir(dict_york)
help(dict_york)
```
All values can be viewed by using the dictionary.values() function. This particular function provides a list-like view containing all values.
To demonstrate, issue the following:
```
help(dict_york.values)
dict_york.values()
```
All keys used to access the key:value pairs within a dictionary can be viewed by using the dictionary.keys() function. This function provides a list-like view containing all keys.
To demonstrate this, issue the following:
```
help(dict_york.keys)
dict_york.keys()
```
Armed with this information, we can retrieve individual values from a dictionary by providing the key associated with the key:value pair.

For example, issue the following:

dict_york['Address']
dict_york['Postal Code']
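
The examples above use keys that exist. As a sketch of what happens when a key is missing (the variable names found, country and country_or_default are hypothetical), square brackets raise a KeyError, while the dictionary's get() method returns None or a default value that you supply:

```python
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Postal Code': 'M3J3M6'}

# Square brackets raise KeyError when the key is missing
try:
    dict_york['Country']
    found = True
except KeyError:
    found = False

# get() returns None (or a default you supply) instead of raising an error
country = dict_york.get('Country')
country_or_default = dict_york.get('Country', 'unknown')

print(found, country, country_or_default)
```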

Dictionary keys can be any immutable values (i.e. values that are not permitted to change). Such types include strings, numbers, and tuples. Try adding a new key and value to the dictionary by issuing:
```
dict_york['Country'] = 'Canada'
dict_york
dict_york.values()
dict_york.keys()
```
Let's add another key:value pair to our dictionary, setting the value of the 'Province' key to 'BC':
```
dict_york['Province'] = 'BC'
dict_york
dict_york.values()
dict_york.keys()
```
WARNING: Dictionary keys must be unique. Attempting to add a key that already exists in the dictionary will overwrite the existing value for that key!
To demonstrate, issue the following:
```
dict_york['Province'] = 'ON'
dict_york
dict_york.values()
dict_york.keys()
```
You should notice that the value for the 'Province' key has been changed to 'ON'.
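
If overwriting should be avoided, one option (a sketch only, using a shortened version of the dictionary) is to test for the key with the in operator before assigning:

```python
dict_york = {'City': 'Toronto', 'Province': 'BC'}

# Assigning to an existing key replaces its value; the number of keys is unchanged
dict_york['Province'] = 'ON'
print(len(dict_york))

# Check first when an existing value must not be overwritten
if 'Country' not in dict_york:
    dict_york['Country'] = 'Canada'
if 'Province' not in dict_york:
    dict_york['Province'] = 'QC'    # skipped: the key already exists

print(dict_york)
```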

The objects returned by values() and keys() are not real Python lists: they are "views of the dictionary", so they cannot be indexed or modified directly. You can change these views into usable lists by using the list() function (after which an index can be used to access individual values).

For example, issue the following:

list_of_keys = list(dict_york.keys())
list_of_keys[0]
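
One consequence of the difference between a view and a list, sketched below with the hypothetical names keys_view and keys_copy: a view stays linked to the dictionary and reflects later changes, while a list made with list() is an independent snapshot.

```python
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto'}

keys_view = dict_york.keys()          # a view: stays linked to the dictionary
keys_copy = list(dict_york.keys())    # a list: an independent snapshot

dict_york['Country'] = 'Canada'       # change the dictionary afterwards

print(len(keys_view))   # the view sees the new key
print(len(keys_copy))   # the list does not
```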

In addition, these views can be changed into sets if we would like to perform comparisons with another set. To demonstrate, issue the following:
```
set_of_keys = set(dict_york.keys())
set_of_values = set(dict_york.values())
set_of_keys | set_of_values
```
Lists and dictionary views can be used with for loops. To demonstrate, issue the following:
```
list_of_keys = list(dict_york.keys())
for key in list_of_keys:
    print(key)
for value in dict_york.values():
    print(value)
```
Additional Information regarding Dictionaries:
- The values and keys can be looped over using an index as well
- The range() function provides a sequence of numbers in a range.
- The len() function provides the number of items in a list.
- Used together, len() and range() can be used to create a sequence of usable indexes for a specific list
Let's rebuild a dictionary from lists that store its data. First, we need to pair up the keys and values held in two separate lists.

Issue the following:

list_of_keys = list(dict_york.keys())
list_of_values = list(dict_york.values())
list_of_indexes = range(0, len(dict_york.keys()))
list_of_indexes
list_of_keys[0]
list_of_values[0]

Now, let's use these newly-created lists together with the len() and range() functions in a for loop to pair each key with its value:

Issue the following:

list_of_keys = list(dict_york.keys())
list_of_values = list(dict_york.values())
for index in range(0, len(list_of_keys)):
    print(list_of_keys[index] + '--->' + list_of_values[index])

Looping using indexes is not the best way to loop through a dictionary. A new dictionary could be created using this method, but it is not recommended:

list_of_keys = list(dict_york.keys())
list_of_values = list(dict_york.values())
new_dictionary = {}
for index in range(0, len(list_of_keys)):
    new_dictionary[list_of_keys[index]] = list_of_values[index]

The above method needs extra variables and an explicit loop. A better way to create a dictionary from two lists is to use the zip() function:

list_of_keys = list(dict_york.keys())
list_of_values = list(dict_york.values())
new_dictionary = dict(zip(list_of_keys, list_of_values))
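
To see what zip() actually produces before it is handed to dict(), here is a small sketch with shortened example lists (the names pairs and short are hypothetical):

```python
list_of_keys = ['Address', 'City', 'Province']
list_of_values = ['70 The Pond Rd', 'Toronto', 'ON']

pairs = list(zip(list_of_keys, list_of_values))   # a list of (key, value) tuples
new_dictionary = dict(pairs)

print(pairs)
print(new_dictionary['City'])

# zip() stops at the shorter input, so extra items are silently dropped
short = dict(zip(['a', 'b', 'c'], [1, 2]))
print(short)
```

Because zip() stops at the shorter input, it is worth checking that both lists have the same length when every key is expected to receive a value.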

Looping through the keys in a dictionary also provides an easy way to get the value for each key at the same time:
```
for key in dict_york.keys():
    print(key + '--->' + dict_york[key])
```
An alternative (possibly more efficient) method is to extract both the key and its value in a single step, using a for loop with the dictionary's items() method:
```
for key, value in dict_york.items():
    print(key + ' | ' + value)
```
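
Each item produced by items() is a (key, value) tuple, so the items can also be sorted before looping. A minimal sketch, assuming output ordered by key is wanted (the list name lines is hypothetical):

```python
dict_york = {'Postal Code': 'M3J3M6', 'City': 'Toronto', 'Address': '70 The Pond Rd'}

lines = []
# sorted() on items() orders the (key, value) tuples by key
for key, value in sorted(dict_york.items()):
    lines.append(key + ' | ' + value)

for line in lines:
    print(line)
```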

Create a Python Script for Managing Dictionaries

Perform the Following Instructions

Create the ~/ops435/lab4/lab4c.py script. The purpose of this script will be to create dictionaries, extract data from dictionaries, and to make comparisons between dictionaries.

Use the following as a template:

#!/usr/bin/env python3

# Dictionaries
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
dict_newnham = {'Address': '1750 Finch Ave E', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M2J2X5', 'Province': 'ON'}
# Lists
list_keys = ['Address', 'City', 'Country', 'Postal Code', 'Province']
list_values = ['70 The Pond Rd', 'Toronto', 'Canada', 'M3J3M6', 'ON']

def create_dictionary(keys, values):
    # Place code here - refer to function specifics in section below
    pass  # placeholder: replace with your code

def split_dictionary(dictionary):
    # Place code here - refer to function specifics in section below
    pass  # placeholder: replace with your code

def shared_values(dict1, dict2):
    # Place code here - refer to function specifics in section below
    pass  # placeholder: replace with your code


if __name__ == '__main__':
    york = create_dictionary(list_keys, list_values)
    print('York: ', york)
    keys, values = split_dictionary(dict_newnham)
    print('Newnham Keys: ', keys)
    print('Newnham Values: ', values)
    keys, values = split_dictionary(york)
    print('York Keys: ', keys)
    print('York Values: ', values)
    common = shared_values(dict_york, dict_newnham)
    print('Shared Values', common)

The script should contain three functions:

create_dictionary()

accepts two lists as arguments (keys and values) and combines these lists to create a dictionary
returns a dictionary that has the keys and associated values from the lists

split_dictionary()

accepts a single dictionary as an argument and splits the dictionary into two lists, keys and values
returns two lists: the return statement can return multiple values (separated by a comma). In our case, use: return keys, values

shared_values()

accepts two dictionaries as arguments and finds all values that are shared between the two dictionaries
(Tip: generate sets containing only values for each dictionary, then use a function mentioned in a previous section to store the values that are common to both sets)
returns a set containing ONLY values found in BOTH dictionaries

Make sure the functions have the correct number of arguments required
The script should show exactly the same output as the samples
The script should contain no errors
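
The return keys, values form mentioned for split_dictionary() works because Python packs comma-separated return values into a single tuple, which the caller can unpack. A minimal sketch using an unrelated, hypothetical function min_max() (it is not part of lab4c.py):

```python
def min_max(numbers):
    # Comma-separated return values are packed into one tuple
    return min(numbers), max(numbers)

result = min_max([4, 9, 2, 7])            # the whole tuple
lowest, highest = min_max([4, 9, 2, 7])   # unpacked into two variables

print(result)
print(lowest, highest)
```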

Sample Run 1:

run lab4c.py
York:  {'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Address': '70 The Pond Rd', 'Province': 'ON', 'City': 'Toronto'}
Newnham Keys:  ['Country', 'Postal Code', 'Address', 'Province', 'City']
Newnham Values:  ['Canada', 'M2J2X5', '1750 Finch Ave E', 'ON', 'Toronto']
York Keys:  ['Country', 'Postal Code', 'Address', 'Province', 'City']
York Values:  ['Canada', 'M3J3M6', '70 The Pond Rd', 'ON', 'Toronto']
Shared Values {'Canada', 'ON', 'Toronto'}

Sample Run 2 (with import):

import lab4c
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
dict_newnham = {'Address': '1750 Finch Ave E', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M2J2X5', 'Province': 'ON'}
list_keys = ['Address', 'City', 'Country', 'Postal Code', 'Province']
list_values = ['70 The Pond Rd', 'Toronto', 'Canada', 'M3J3M6', 'ON']

york = lab4c.create_dictionary(list_keys, list_values)

york
{'Address': '70 The Pond Rd',
 'City': 'Toronto',
 'Country': 'Canada',
 'Postal Code': 'M3J3M6',
 'Province': 'ON'}

keys, values = lab4c.split_dictionary(dict_newnham)

keys
['Country', 'Postal Code', 'Address', 'Province', 'City']

values
['Canada', 'M2J2X5', '1750 Finch Ave E', 'ON', 'Toronto']

keys, values = lab4c.split_dictionary(york)

keys
['Country', 'Postal Code', 'Address', 'Province', 'City']

values
['Canada', 'M3J3M6', '70 The Pond Rd', 'ON', 'Toronto']

common = lab4c.shared_values(dict_york, dict_newnham)

common
{'Canada', 'ON', 'Toronto'}

3. Exit the ipython3 shell, download the checking script and check your work. Enter the following commands from the bash shell.

cd ~/ops435/lab4/
pwd #confirm that you are in the right directory
ls CheckLab4.py || wget matrix.senecac.on.ca/~acoatley-willis/CheckLab4.py
python3 ./CheckLab4.py -f -v lab4c

4. Before proceeding, make certain that you identify and fix any and all errors in lab4c.py. Proceed to the next step only when the checking script tells you everything is OK.

PART 4 - List Comprehension

We have already had an introduction to lists. We will now explore advanced functions that use and generate lists. This is a very common practice in Python: understanding how to generate, manipulate, and apply functions to items inside a list can be incredibly useful. List comprehension is a way to build new lists from existing lists, often more concisely (and frequently faster) than explicitly looping over them.

Perform the Following Steps

Let's start by creating a list and then applying some functions to each item in that list. Issue the following to create a list and then display the square for each item within that list:
```
l1 = [1, 2, 3, 4, 5]
for item in l1:
    print(item ** 2)
```
In order to store these results (i.e. squares) for later use, you would have to create a new list and append the squares to it. This will generate a new list that contains the squared values in the same positions as the values in the first list. In this way, you are using an existing list in order to create a new list.
To demonstrate, issue the following:
```
l1 = [1, 2, 3, 4, 5]
l2 = []
for item in l1:
    l2.append(item ** 2)
l1
l2
```
Since this may be a repetitive task, it makes more sense to create a function that returns the square of a number, and to call that function for each item in the existing list while appending the results to the new list.
Issue the following to see how that can be performed:
```
def square(number):
    return number ** 2

l1 = [1, 2, 3, 4, 5]
l2 = []
for item in l1:
    l2.append(square(item))

l1
l2
```
The map() function can be used to apply a function to each item in a list. This is exactly what happened in the previous example; however, using the map() function provides better syntax and removes the loop (including the variable that was created inside the loop). Therefore, using the map() function can make your Python script more concise while performing the same task.
To demonstrate, issue the following:
```
def square(number):
    return number ** 2

l1 = [1,2,3,4,5]
l2 = list(map(square, l1))

l1
l2
```
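
The list() call in the example above is needed because, in Python 3, map() returns a map object that produces its results on demand and can only be consumed once. A short sketch (first_pass and second_pass are hypothetical names):

```python
def square(number):
    return number ** 2

l1 = [1, 2, 3, 4, 5]
squares = map(square, l1)        # a map object, not a list

first_pass = list(squares)       # consumes the map object
second_pass = list(squares)      # nothing is left to consume

print(first_pass)
print(second_pass)
```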
The above map() function requires another function as well as a list. This means that before calling the map() function, that other function must already have been defined earlier in the script. This step can be avoided through the use of anonymous functions: the ability to create a simple function without a def statement and pass it on to other function calls. You will use the lambda keyword, which returns a function that you can use immediately (i.e. without having to declare it separately in your script). The function below takes 1 argument (called: x) and squares that value.

To demonstrate, issue the following:

square = lambda x: x ** 2
l1 = [1,2,3,4,5]
l2 = list(map(square, l1))

l1
l2

The above code is actually not particularly good: the whole purpose of using lambda is to avoid the function definition and quickly return a function, yet here the function is still stored under the name square. It does, however, show exactly what lambda does: it returns a function for use. Fix this by removing the square variable and using the function returned by lambda directly. Remember what map() requires: its first argument is a function, and its second argument is a list. Here lambda will return a function and provide it as the first argument.
```
l1 = [1,2,3,4,5]
l2 = list(map(lambda x: x ** 2, l1))

l1
l2
```
Using map() and lambda as shown above, our code is more concise than when using multiple variables and loops.
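
The same result can also be written as a list comprehension, which places the expression and the loop inside square brackets. A minimal sketch (the names squares and even_squares are chosen for this illustration):

```python
l1 = [1, 2, 3, 4, 5]

# Same result as list(map(lambda x: x ** 2, l1))
squares = [item ** 2 for item in l1]

# An optional 'if' clause filters items while building the list
even_squares = [item ** 2 for item in l1 if item % 2 == 0]

print(squares)
print(even_squares)
```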

INVESTIGATION 2: STRINGS

Strings are basically a sequence of characters (bits of text). Strings store text so that it can be used later for manipulation (by a wide range of functions). This section will investigate strings in more detail, such as cutting strings into sub-strings, joining strings, formatting strings, searching through strings, and matching strings against patterns.

Strings are immutable data objects - this means that once a string is created, it cannot be modified. In order to make a change, you create a new string, typically from a modified copy of part of the original string (i.e. a sub-string).
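
A short sketch of what this means in practice (changed and new_code are hypothetical names): assigning to a character position fails, so a new string is built from pieces of the old one instead.

```python
course_code = 'OPS435'

# Item assignment is not allowed on a string
try:
    course_code[0] = 'o'
    changed = True
except TypeError:
    changed = False

# Build a new string from a modified piece plus the rest of the original
new_code = course_code[0].lower() + course_code[1:]

print(changed)
print(course_code, new_code)
```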

PART 1 - Strings and Substrings

This first part will explain basic concepts of using strings, printing strings, and manipulating sub-strings.

Perform the Following Steps:

Launch the ipython3 shell
```
ipython3
```
Create strings to manipulate and print
```
course_name = 'Open System Automation'
course_code = 'OPS435'
course_number = 435
```
Strings can contain any characters inside them, whether they are letters, numbers, or symbols. In our ipython3 shell the values inside each string variable can be seen just by typing the string variable name. However, when writing python scripts, these string variables should be placed inside print() functions in order to display on the screen.

Strings can also be concatenated (combined together) by using the + sign. Just make sure that strings are only concatenated with other strings (no lists, no numbers, no dictionaries, etc.).
To demonstrate, issue the following:
```
course_name
course_code
course_number
print(course_name)
print(course_code)
print(str(course_number))
print(course_name + ' ' + course_code + ' ' + str(course_number))
```
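
To illustrate why str() is used around course_number above, the following sketch (label is a hypothetical name) shows that concatenating a string with an integer raises a TypeError, and that converting the number first avoids it:

```python
course_code = 'OPS435'
course_number = 435

# Concatenating a string with an integer raises TypeError
try:
    label = course_code + course_number
except TypeError:
    label = course_code + ' ' + str(course_number)   # convert first, then concatenate

print(label)
```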
Strings can also be repeated by multiplying the string by a number. This will repeat the string that many times. Repetition with * is useful whenever a string needs to be repeated more than once.
Issue the following:
```
print(course_name + '-' + course_code)
print(course_name + '-'*5 + course_code)
print(course_name + '-'*25 + course_code)
print('abc'*2)
print(course_code*5)
```
When using the print() function, you can display special characters. One such special character is the newline character (denoted by the symbol: \n). This allows you to separate content onto new lines or to add empty lines.
To demonstrate, issue the following:
```
print('Line 1\nLine 2\nLine 3\n')
```
By using both string repetition and a newline character, multiple lines can be created at once. Issue the following:
```
print('Line 1' + '\n'*4 + 'Line 5\nLine 6')
```
Strings have many built-in functions that we can use to manipulate text. Let's take a look at the string's namespace and the available functions:
```
dir(course_name)
help(course_name)
```

Let's try out several different functions. Refer back to the help() function for more information. These are quick ways to view strings in different ways. Issue the following:

course_name.lower()         # Returns a string in lower-case letters
course_name.upper()         # Returns a string in upper-case letters
course_name.swapcase()      # Returns a string with upper-case and lower-case letters swapped
course_name.title()         # Returns a string with the first letter of each word in upper-case and the rest in lower-case
course_name.capitalize()    # Returns a string with the first letter of the string in upper-case and the rest in lower-case

These values can be saved inside new strings and reused for any new tasks:

lower_name = course_name.lower()    # Save the returned lower-case string inside a new string variable
print(lower_name)

If a string contains many values separated by a single character, such as a space, the string can be split on that character to create a list of values:
```
lower_name.split(' ')       # Provide the split() function with a character to split on
```

The above will return a list of strings, which we can access just like any other list:

list_of_strings = lower_name.split(' ')     # Split string on spaces and store the list in a variable
list_of_strings                             # Display list
list_of_strings[0]                          # Display first item in list

Since this list is actually a list of strings, any function that works on strings will work on the items in the list:
```
list_of_strings[0].upper()                  # Use the function after the index to interact with just a single string in the list
first_word = list_of_strings[0]
first_word
print(first_word)
```
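
As a sketch of how the pieces can be put back together (join() is a standard string method; the names words, upper_words and rejoined are hypothetical), the join() method is the reverse of split():

```python
course_name = 'Open System Automation'

words = course_name.lower().split(' ')     # split the lower-case string on spaces
upper_words = []
for word in words:
    upper_words.append(word.upper())       # any string function works on each item

# join() is the reverse of split(): it glues list items together with a separator
rejoined = '-'.join(upper_words)

print(words)
print(rejoined)
```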
The index that is used inside of lists is also used to access characters within a string. For practice, let's create a new string and start accessing the string's index.

Perform the following:

course_name = 'Open System Automation'
course_code = 'OPS435'
course_number = 435
course_code[0]                          # Return a string that is the first character in course_code
course_code[2]                          # Return a string that is the third character in course_code
course_code[-1]                         # Return a string that is the last character in course_code
str(course_number)[0]                   # Turn the integer into a string, return first character in that string
course_code[0] + course_code[1] + course_code[2]

While a list's index refers to each item in the list, a string's index refers to each character in the string. A range of characters taken out of a string is called a substring. A substring can be used to create a new string or to display only a small portion of the original string.

To demonstrate, issue the following:

course_name[0:4]                        # This will return the first four characters NOT including index 4 -> indexes 0,1,2,3 -> but not index 4
first_word = course_name[0:4]           # Save this substring for later use
course_code[0:3]                        # This will return the first three characters NOT including index 3 -> indexes 0,1,2 -> but not index 3
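
The difference between a single index and a slice can be sketched as follows. The string `word` is a hypothetical example, not one of the lab's variables.

```python
# Hypothetical example string, not part of the lab
word = 'Python'

first = word[0]          # Indexing returns a one-character string
start = word[0:3]        # Slicing returns the characters at indexes 0, 1, 2
rest = word[3:6]         # The characters at indexes 3, 4, 5

print(first)                  # P
print(start)                  # Pyt
print(rest)                   # hon
print(len(word[1:4]))         # 3 (stop minus start: 4 - 1)
print(start + rest == word)   # True
```

Because the stop index is excluded, two slices that share a boundary (here index 3) join back together without repeating or losing a character.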

The index syntax supports some extra features using the colon and negative numbers:

course_name = 'Open System Automation'
course_name[12:]                        # Return the substring from index 12 until the end of the string
course_name[5:]                         # Return the substring from index 5 until the end of the string
course_name[-1]                         # Return the last character

With negative indexes, the index counts from the right side of the string: -1 is the last character, -2 is the second-last character, and so on.
```
course_name = 'Open System Automation'
course_name[-1]
course_name[-2]
```

This allows a string to return substrings like the ones below:

course_name = 'Open System Automation'
course_name[-10:]                                   # Return the last ten characters
course_name[-10:-6]                                 # Try to figure out what this returns
course_name[0:4] + course_name[-10:-6]              # Combine substrings together
substring = course_name[0:4] + course_name[-10:-6]  # Save the combined substring as a new string for later
substring
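
One way to reason about negative indexes is that index -n refers to the same character as index len(string) - n. The sketch below checks this on a made-up string (`text` is hypothetical, chosen so it does not give away the exercise above).

```python
# Hypothetical example string, not part of the lab
text = 'scripting'
length = len(text)                        # 9 characters, indexes 0 to 8

last = text[-1]                           # Same character as text[length - 1]
tail = text[-3:]                          # Same substring as text[length - 3:]
middle = text[-6:-3]                      # Same substring as text[3:6]

print(last, text[length - 1])             # g g
print(tail, text[length - 3:])            # ing ing
print(middle, text[3:6])                  # ipt ipt
```

Negative indexes are convenient when the length of the string is not known in advance, since the position is always counted from the end.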

The real power found in substrings goes beyond just manually writing index values and getting back words. The next part of this investigation will cover how to search through a string for a specific word, letter, number, and return the index to that search result.

Create a Python Script Demonstrating Substrings

Perform the Following Instructions

Create the ~/ops435/lab4/lab4d.py script. The purpose of this script is to demonstrate creating and manipulating strings. There will be four functions, and each will return a single string.

Use the following template to get started:

#!/usr/bin/env python3
# Strings 1

str1 = 'Hello World!!'
str2 = 'Seneca College'

num1 = 1500
num2 = 1.50

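def first_five(string):
    pass    # Placeholder: replace with your code

def last_seven(string):
    pass    # Placeholder: replace with your code

def middle_number(number):
    pass    # Placeholder: replace with your code

def first_three_last_three(string1, string2):
    pass    # Placeholder: replace with your code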

if __name__ == '__main__':
    print(first_five(str1))
    print(first_five(str2))
    print(last_seven(str1))
    print(last_seven(str2))
    print(middle_number(num1))
    print(middle_number(num2))
    print(first_three_last_three(str1, str2))
    print(first_three_last_three(str2, str1))

The first_five() function accepts a single string argument
The first_five() function returns a string that contains the first five characters of the argument given
The last_seven() function accepts a single string argument
The last_seven() function returns a string that contains the last seven characters of the argument given
The middle_number() function accepts a number (an integer or a float) as an argument
The middle_number() function returns a string containing the second and third characters in the number
The first_three_last_three() function accepts two string arguments
The first_three_last_three() function returns a single string that starts with the first three characters of argument1 and ends with the last three characters of argument2
Example: first_three_last_three('abcdefg', '1234567') returns single string 'abc567'

Sample Run 1

run lab4d.py 
Hello
Senec
World!!
College
50
.5
Helege
Send!!

Sample Run 2 (import)

import lab4d
str1 = 'Hello World!!'
str2 = 'Seneca College'
num1 = 1500
num2 = 1.50
lab4d.first_five(str1)
'Hello'
lab4d.first_five(str2)
'Senec'
lab4d.last_seven(str1)
'World!!'
lab4d.last_seven(str2)
'College'
lab4d.middle_number(num1)
'50'
lab4d.middle_number(num2)
'.5'
lab4d.first_three_last_three(str1, str2)
'Helege'
lab4d.first_three_last_three(str2, str1)
'Send!!'

3. Exit the ipython3 shell, download the checking script and check your work. Enter the following commands from the bash shell.

cd ~/ops435/lab4/
pwd #confirm that you are in the right directory
ls CheckLab4.py || wget matrix.senecac.on.ca/~acoatley-willis/CheckLab4.py
python3 ./CheckLab4.py -f -v lab4d

4. Before proceeding, make certain that you identify and fix any and all errors in lab4d.py. When the checking script tells you everything is OK, proceed to the next step.

PART 2 - String Formatting

In Python, string concatenation with plus signs and commas is very limited and becomes very messy when used at length with many values or calculations. This section covers the format() function that every string provides. The format() function allows for well-formatted code, text alignment, and efficient, clean conversion of values. While this section uses lists and dictionaries, remember that these are lists of strings and dictionaries with string values.

Perform the Following Steps:

Start the ipython3 shell
```
ipython3
```
To start, use the format() function on a string. The .format() call goes on the end of the string, and arguments can be provided to it:
```
print( 'College is Great'.format() )
```
The above example does not actually do any formatting. Next, add a string using the format() function's arguments:
```
print('College is {}'.format('Great'))
```
When format() finds {} curly braces, it performs special actions. If format() finds {} with nothing inside, it substitutes a value from its arguments. The arguments are matched by position, left to right:
```
print('{} {} {}'.format('a', 'b', 'c'))
```
However, this positional substitution, while quick, easy, and clean, has an issue: if the string contains more curly braces {} than format() has arguments, an error is raised. This next string will raise an IndexError:
```
print('{} {} {} {}'.format('a', 'b', 'c'))
```
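
To see the error without stopping a script, it can be caught with try/except. This is only a sketch; the variable names `template`, `result`, `fixed`, and `extra` are made up for the illustration.

```python
template = '{} {} {} {}'                 # Four placeholders

try:
    result = template.format('a', 'b', 'c')    # Only three values supplied
except IndexError as error:
    result = None
    print('format() failed:', error)

# Supplying a fourth value resolves the problem
fixed = template.format('a', 'b', 'c', 'd')
print(fixed)                             # a b c d

# The reverse case is fine: unused extra arguments are ignored
extra = '{}'.format('a', 'b')
print(extra)                             # a
```
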
For situations like the one above, or when the same value needs to be reused more than once, position numbers can be placed inside the curly braces:
```
print('{0} {1} {1} {2}'.format('a', 'b', 'c'))
print('{2} {2} {1} {0}'.format('a', 'b', 'c'))
```
These positions make formatting each string much less prone to errors, while also making the string being formatted much clearer. To improve on formatting further, provide the format() function with string variables:
```
course_name = 'Open System Automation'
course_code = 'OPS435'
print('{0} {1} {0}'.format(course_code, course_name))
```
By default, the format() function displays values as strings: all values in {} are displayed as strings unless extra options are given:
```
course_number = 435             # This is an integer
print('This is displaying a string by default: {0}'.format(course_number))
```
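
The sketch below contrasts format() with plain concatenation for an integer value. The variable names `formatted`, `broken`, and `works` are chosen only for this illustration.

```python
course_number = 435                      # This is an integer

# format() converts the integer for display; the result is a string
formatted = 'Course number: {0}'.format(course_number)
print(formatted)                         # Course number: 435
print(type(formatted))                   # <class 'str'>

# Concatenation with + does not convert automatically
try:
    broken = 'Course number: ' + course_number
except TypeError as error:
    broken = None
    print('Concatenation failed:', error)

# With +, the conversion has to be written out explicitly
works = 'Course number: ' + str(course_number)
print(works == formatted)                # True
```
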

Next place a list inside the format() function and access the values, there are two ways to use the list here

list1 = [1,2,3,4,5]                                            # While this is a list of numbers, format() by default displays everything as a str()
print('{0[0]} {0[1]} {0[2]} {0[3]} {0[4]}'.format(list1))      # Access the values of the list using indexes {position[index]} 
print('{0} {1} {2} {3} {4}'.format(*list1))                    # Use *list1 to expand the list into multiple positional arguments for format()
print('{} {} {} {} {}'.format(*list1))                         # Use *list1 to expand the list into multiple positional arguments for format()
print('{0} {1} {2} {3} {4}'.format(1,2,3,4,5))                 # Using *list1 in this situation is the same as this
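
To make the effect of the * more visible, the following sketch uses a small helper, `count_arguments()`, which is hypothetical and exists only to count how many positional arguments a call receives.

```python
def count_arguments(*args):
    # Hypothetical helper: returns how many positional arguments were passed
    return len(args)

list1 = [1, 2, 3, 4, 5]

print(count_arguments(list1))            # 1 (the whole list is one argument)
print(count_arguments(*list1))           # 5 (each item becomes its own argument)

by_index = '{0[0]} {0[4]}'.format(list1)     # One argument, indexed inside the braces
expanded = '{0} {4}'.format(*list1)          # Five arguments, selected by position
print(by_index)                          # 1 5
print(expanded)                          # 1 5
```
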

Next, place a dictionary inside the format() function and access the values. Again, there are a few ways to use the dictionary:

dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
print('{}'.format(dict_york))                                  # Print out whole dictionary
print('{0}'.format(dict_york))                                 # Print out whole dictionary using format arguments position
print('{0[City]} {0[Country]}'.format(dict_york))              # Print out values using position and key {position[key]}

With dictionaries, however, instead of repeating positional argument 0 for each key access, Python allows a dictionary to be expanded into keyword arguments. First, take a look at an example of keyword arguments: place the keyword name between {} and pass the matching keyword argument to the format() function:
```
print('{string1} is {string2} {string2}'.format(string1='College', string2='Great!'))
```

Variables may also be passed to these keyword arguments

college = 'Seneca College'
print('{string1} is {string2} {string2}'.format(string1=college, string2='Great!'))

Now back to dictionaries, using keyword arguments (sometimes referred to as kwargs in Python). A dictionary can be quickly and easily expanded into keyword arguments using the syntax **dictionary_name. Pay close attention to the keys inside the dictionary and the values associated with each key:

dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
print('{City} {Province} {Country}'.format(**dict_york))                                        # Uses the dictionary's keyword arguments
print('{City} {Province} {Country}'.format(City='Toronto', Province='ON', Country='Canada'))    # Creates new keyword arguments
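
Every name used inside the curly braces must exist as a key in the expanded dictionary. The sketch below shows what happens when one is missing; `dict_lab` and the key names are made up for this illustration.

```python
# Hypothetical dictionary used only for illustration
dict_lab = {'Course': 'OPS435', 'Lab': 4, 'Topic': 'Strings'}

line = '{Course} lab {Lab}: {Topic}'.format(**dict_lab)
print(line)                              # OPS435 lab 4: Strings

missing = None
try:
    '{Course} {Room}'.format(**dict_lab)     # 'Room' is not a key in dict_lab
except KeyError as error:
    missing = error.args[0]
    print('Missing key:', missing)       # Missing key: Room
```
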

PART 3 - String Formatting Expanded

This next section expands on how to use the format() function by bringing in the ability to work with numbers and align text.

Perform the Following Steps

Start the ipython3 shell
```
ipython3
```

Start with numbers. Study the following examples showing the different ways to change the output inside the curly braces {} while formatting. This formatting change is made by placing a colon inside the curly braces, followed by a letter, as in {0:f}:

number1 = 50
number2 = -50
number3 = 1.50
print('{0:f}'.format(number1))          # The 0 is the position just like other format() functions
print('{0:f}'.format(number2))          # The colon separates the position/index from the extra formatting options
print('{0:f}'.format(number3))          # The f stands for fixed point number

With a fixed-point format, you can control the number of digits that come after the decimal point. Try changing the .2 to any other number and experiment:

print('{0:.0f}'.format(number1))        # Show no digits after decimal point
print('{0:.2f}'.format(number2))        # Show two digits after decimal point
print('{0:.1f}'.format(number3))        # Show one digit after decimal point
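
The precision does more than cut off digits: the value is rounded to the requested number of places, and the result is always a string. A minimal sketch, using a made-up value `value`:

```python
value = 3.14159                          # Hypothetical value for illustration

default = '{0:f}'.format(value)          # Six digits after the decimal point by default
two = '{0:.2f}'.format(value)            # Rounded to two digits
four = '{0:.4f}'.format(value)           # Rounded to four digits
zero = '{0:.0f}'.format(value)           # No digits after the decimal point
padded = '{0:.2f}'.format(50)            # Integers gain the decimal places

print(default)                           # 3.141590
print(two)                               # 3.14
print(four)                              # 3.1416
print(zero)                              # 3
print(padded)                            # 50.00
```
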

Numbers can be displayed with the - or + signs before the digits, this could be important in formatting

print('{0: f}'.format(number1))                     # Show a space where plus sign should be
print('{0: f}'.format(number2))                     # Shows negative sign normally
print('{0: f}\n{1: f}'.format(number1, number2))    # The space before the f lines up positive and negative numbers

Placing a + before the f changes the format so that plus signs show up for positive numbers and negative signs show up for negative numbers

print('{0:+f}'.format(number1))                     # Shows a plus sign for positive numbers
print('{0:+f}'.format(number2))                     # Shows negative sign normally
print('{0:+f}\n{1:+f}'.format(number1, number2))    # The signs line up positive and negative numbers

Combining fixed-point precision and a sign looks like the following:

print('{0:+.2f}'.format(number1))
print('{0:+.2f}'.format(number2))
print('{0:+.2f}\n{1:+.2f}'.format(number1, number2))

If the numbers being shown are all integers and do not require decimal places, use {:d} (decimal INTEGERS) instead of {:f}:

print('{0:d}'.format(number1))
print('{0:d}'.format(number2))
print('{0: d}'.format(number1))
print('{0: d}'.format(number2))
print('{0:+d}'.format(number1))
print('{0:+d}'.format(number2))

When using {:d}, be careful not to use numbers with a decimal value in them. The following will raise a ValueError:
```
number3 = 1.50
print('{0:d}'.format(number3))
```
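
If a float has to be shown as a whole number, it can be converted before formatting or formatted with zero decimal places. The sketch below shows both under the assumption that either behaviour is acceptable; note that the two approaches do not give the same result. The names `text`, `truncated`, and `rounded` are made up for this example.

```python
number3 = 1.50

try:
    text = '{0:d}'.format(number3)       # {:d} does not accept a float
except ValueError as error:
    text = None
    print('format() failed:', error)

truncated = '{0:d}'.format(int(number3)) # int() drops the fractional part
rounded = '{0:.0f}'.format(number3)      # {:.0f} rounds instead

print(truncated)                         # 1
print(rounded)                           # 2
```
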
Next, let's move on to alignment of text. This is the process of adding padding to the left of our text, to the right of our text, or to both sides (centre alignment). Through alignment, the field size of any string can be set. Start with a string, placing a number after the colon, as in {0:10}:
```
string1 = 'hello'
string2 = 'world'
print('{0:10}{1}'.format(string1, string2))      # Make sure string1 is 10 characters aligned to the left
print('{0:6}{1}'.format(string1, string2))       # Make sure string1 is 6 characters aligned to the left
```
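
The field width is a minimum, not a limit. The sketch below, using made-up variable names, shows that short values are padded up to the width, longer values are left untouched, and numbers are padded on the opposite side from strings.

```python
short = 'hello'
long_word = 'automation'                 # Hypothetical 10-character string

padded = '{0:10}'.format(short)          # Padded with spaces on the right
untouched = '{0:6}'.format(long_word)    # Longer than the width: not truncated
number = '{0:6}'.format(42)              # Numbers are padded on the left by default

print('|' + padded + '|')                # |hello     |
print(len(padded))                       # 10
print('|' + untouched + '|')             # |automation|
print('|' + number + '|')                # |    42|
```
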

By default, format() aligns strings to the left (numbers are aligned to the right by default). The symbol to request left alignment explicitly is <, which looks like {0:<10} when used in the curly braces. Now whenever a different string is placed inside, it is always aligned to the left of a 10-character field. This allows for INCREDIBLY well-structured output:

print('{:<10} {:<10} {:<10}\n{:<10} {:<10} {:<10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))       # Without positional argument numbers
print('{0:<10} {1:<10} {2:<10}\n{3:<10} {4:<10} {5:<10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123')) # With positional argument numbers

Next try using right alignment with the > symbol replacing the left alignment

print('{:>10} {:>10} {:>10}\n{:>10} {:>10} {:>10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))       # Without positional argument numbers
print('{0:>10} {1:>10} {2:>10}\n{3:>10} {4:>10} {5:>10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123')) # With positional argument numbers

Finally center alignment with the ^ symbol

print('{:^10} {:^10} {:^10}\n{:^10} {:^10} {:^10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))       # Without positional argument numbers
print('{0:^10} {1:^10} {2:^10}\n{3:^10} {4:^10} {5:^10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123')) # With positional argument numbers

The fill character used for padding can be changed to any other character wanted for the output. By default it is a space, but adding a character such as M or * before the alignment symbol, as in {:M<10}, uses that character instead:

print('{:*^10} {:*^10} {:*^10}\n{:*^10} {:*^10} {:*^10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))       # Without positional argument numbers
print('{:.^10} {:.^10} {:.^10}\n{:.^10} {:.^10} {:.^10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))       # Without positional argument numbers
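
As a compact summary of the alignment options, the sketch below formats the same made-up string (`word`) in each style. The | characters are added only to make the padding visible.

```python
word = 'lab4'                            # Hypothetical 4-character string

left = '{:<10}'.format(word)             # Padding on the right
right = '{:>10}'.format(word)            # Padding on the left
centre = '{:^10}'.format(word)           # Padding split between both sides
filled = '{:*^10}'.format(word)          # Same as centre, but filled with *
uneven = '{:-^9}'.format(word)           # Odd amount of padding: the extra goes on the right

print('|' + left + '|')                  # |lab4      |
print('|' + right + '|')                 # |      lab4|
print('|' + centre + '|')                # |   lab4   |
print('|' + filled + '|')                # |***lab4***|
print('|' + uneven + '|')                # |--lab4---|
```
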

Make the output look better by adding borders. Create this function, which displays some data enclosed in borders, and try to understand what is happening. This function does not return any value; it only prints formatted text:
```
def print_with_borders():
    print('|{:-^10}|'.format('nums'))
    print('|{:^5}{:^5}|'.format(1,2))
    print('|{:^5}{:^5}|'.format(3,4))
    print('|{}|'.format('-'*10))

# Run the function
print_with_borders()
```

Create a Script Demonstrating Formatting Strings

Perform the Following Instructions:

Create the ~/ops435/lab4/lab4e.py script. The purpose of this script is to demonstrate formatting string output from a large data structure.

Use the following template to get started:

#!/usr/bin/env python3
# Formatted Strings

dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
college = 'Seneca College'

def print_college_address(address_dict, title):
    # Prints out the keys and values from the dictionary
    # Formats the output to have a title bar with a title
    # EXACT SAME output as the samples
    pass    # Placeholder: replace with your code

if __name__ == '__main__':
    print_college_address(dict_york, college)

The print_college_address() function does NOT return anything
The print_college_address() function accepts two arguments
The first argument is a dictionary
The second argument is a string
The title printed is centre-aligned in a field of 40 characters, using '-' instead of a space as the fill character
The keys printed are centre-aligned in a field of 20 characters
The values printed are centre-aligned in a field of 20 characters
The output must match the sample output EXACTLY; if one character is off, it will be wrong

Sample Run 1:

run lab4e.py
|-------------Seneca College-------------|
|      Address          70 The Pond Rd   |
|      Province               ON         |
|    Postal Code            M3J3M6       |
|        City              Toronto       |
|      Country              Canada       |
|----------------------------------------|

Sample Run 2 (with import):

import lab4e
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
dict_newnham = {'Address': '1750 Finch Ave E', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M2J2X5', 'Province': 'ON'}
college = 'Seneca College'

lab4e.print_college_address(dict_york, college)
|-------------Seneca College-------------|
|      Address          70 The Pond Rd   |
|      Province               ON         |
|    Postal Code            M3J3M6       |
|        City              Toronto       |
|      Country              Canada       |
|----------------------------------------|

lab4e.print_college_address(dict_newnham, college)
|-------------Seneca College-------------|
|      Address         1750 Finch Ave E  |
|      Province               ON         |
|    Postal Code            M2J2X5       |
|        City              Toronto       |
|      Country              Canada       |
|----------------------------------------|

3. Exit the ipython3 shell, download the checking script and check your work. Enter the following commands from the bash shell.

cd ~/ops435/lab4/
pwd #confirm that you are in the right directory
ls CheckLab4.py || wget matrix.senecac.on.ca/~acoatley-willis/CheckLab4.py
python3 ./CheckLab4.py -f -v lab4e

4. Before proceeding, make certain that you identify and fix any and all errors in lab4e.py. When the checking script tells you everything is OK, proceed to the next step.

LAB 4 SIGN-OFF (SHOW INSTRUCTOR)

Students should be prepared with all required commands (system information) displayed in a terminal (or multiple terminals) prior to calling the instructor for signoff.

Have Ready to Show Your Instructor:

✓ x

✓ Lab4 logbook notes completed

Practice For Quizzes, Tests, Midterm & Final Exam

x
x
x


OPS435 Python Lab 4

OBJECTIVES

PYTHON REFERENCE

INVESTIGATION 1: DATA STRUCTURES

PART 1 - Tuples

PART 2 - Sets

PART 3 - Dictionaries

PART 4 - List Comprehension

INVESTIGATION 2: STRINGS

PART 1 - Strings and Substrings

PART 2 - String Formatting

PART 3 - String Formatting Expanded

LAB 4 SIGN-OFF (SHOW INSTRUCTOR)

Practice For Quizzes, Tests, Midterm & Final Exam
