OPS435 Python Lab 4

23,358 bytes added, 22:39, 19 June 2017
= INVESTIGATION 2: STRINGS =
:Strings are, in their most basic form, a list of characters, or a bit of text. Strings store text so that it can be used later by a wide range of functions. This section covers more than just displaying text on the screen: it discusses cutting strings into sub-strings, joining strings together, formatting strings, searching through strings, and matching strings against patterns. Strings are immutable data objects, which means that once a string is created, it cannot be modified. To make a change inside a string, copy the parts (sub-strings) to keep and create a new string with the added values. This is a common and simple procedure, and Python provides special syntax to accomplish it.
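:As a minimal sketch of what immutability means in practice (the names '''changed_in_place''' and '''new_code''' are made up for this illustration), the following tries to change one character in place, which Python refuses with a TypeError, and then builds a new string from a sub-string instead:
<source>
course_code = 'OPS435'

# Strings are immutable: assigning to an index raises a TypeError
try:
    course_code[0] = 'X'
    changed_in_place = True
except TypeError:
    changed_in_place = False

# Build a new string from the part to keep plus the new value
new_code = 'X' + course_code[1:]

print(changed_in_place)   # The in-place change did not happen
print(course_code)        # The original string is untouched
print(new_code)           # The new string holds the change
</source>
:The original string never changes; the "modified" version is always a separate, newly created string.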
== PART 1 - Strings and Substrings ==
This first part will explain basic concepts of using strings, printing strings, and manipulating sub-strings.
:'''Perform the Following Steps:'''
:#Launch the ipython3 shell
<source>
ipython3
</source>
:#Create strings to manipulate and print
<source>
course_name = 'Open System Automation'
course_code = 'OPS435'
course_number = 435
</source>
:#Strings can contain any characters inside them, whether they are letters, numbers, or symbols. In the ipython3 shell the value inside each string variable can be seen just by typing the variable name. However, when writing Python scripts, these string variables should be placed inside print() functions in order to display on the screen. Strings can also be concatenated (combined together) by using the '''+''' sign; just make sure strings are only concatenated with other strings (no lists, no numbers, no dictionaries, etc.)
<source>
course_name
course_code
course_number
print(course_name)
print(course_code)
print(str(course_number))
print(course_name + ' ' + course_code + ' ' + str(course_number))
</source>
:#Strings can also use special syntax for string repetition by multiplying the string by a number. This will repeat that string that many times. Repetition with '''*''' is useful whenever a string needs to be repeated more than once
<source>
print(course_name + '-' + course_code)
print(course_name + '-'*5 + course_code)
print(course_name + '-'*25 + course_code)
print('abc'*2)
print(course_code*5)
</source>
:#This can be especially useful when dealing with special characters. '''\n''' is the newline character.
:#Using '''\n''' in a string ends the line and continues printing on the next line
<source>
print('Line 1\nLine 2\nLine 3\n')
</source>
:#By using string repetition on the newline character, multiple lines can be created at once
<source>
print('Line 1' + '\n'*4 + 'Line 5\nLine 6')
</source>
:#Strings have many functions built into them, and many more can be used on them. Take a look at the string's namespace and the available functions
<source>
dir(course_name)
help(course_name)
</source>
:#Let's try out several different functions; refer back to the help() function for more information. These are quick ways to view strings in different ways
<source>
course_name.lower()       # Returns a string in lower-case letters
course_name.upper()       # Returns a string in upper-case letters
course_name.swapcase()    # Returns a string with upper-case and lower-case letters swapped
course_name.title()       # Returns a string with the first letter of each word in upper-case and the rest in lower-case
course_name.capitalize()  # Returns a string with the first letter of the string in upper-case and the rest in lower-case
</source>
:#These values can be saved inside new strings and reused for any new tasks
<source>
lower_name = course_name.lower()  # Save the returned lower-case string inside a new string variable
print(lower_name)
</source>
:#If a string contains many values separated by a single character, such as a space, the string can be split on that character to create a list of values
<source>
lower_name.split(' ')  # Provide the split() function with a character to split on
</source>
:#The above will return a list of strings, which can be accessed just like any other list
<source>
list_of_strings = lower_name.split(' ')  # Split string on spaces and store the list in a variable
list_of_strings                          # Display list
list_of_strings[0]                       # Display first item in list
</source>
:#Since this list is actually a list of '''strings''', any function that works on strings will work on items in the list
<source>
list_of_strings[0].upper()  # Use the function after the index to interact with just a single string in the list
first_word = list_of_strings[0]
first_word
print(first_word)
</source>
:#The index that is used inside lists is also used to access characters in a string. Take a step back from lists for now, create a brand new string, and start accessing the string's index
<source>
course_name = 'Open System Automation'
course_code = 'OPS435'
course_number = 435
course_code[0]         # Return a string that is the first character in course_code
course_code[2]         # Return a string that is the third character in course_code
course_code[-1]        # Return a string that is the last character in course_code
str(course_number)[0]  # Turn the integer into a string, return first character in that string
course_code[0] + course_code[1] + course_code[2]
</source>
:#While a list's index is for each item in the list, a string's index is for each character in the string.
:#Taking a number of characters out of a string produces a substring. A substring can be used either to create a new string or to display only a small portion of the original
<source>
course_name[0:4]               # Returns the first four characters: indexes 0, 1, 2, 3, but NOT index 4
first_word = course_name[0:4]  # Save this substring for later use
course_code[0:3]               # Returns the first three characters: indexes 0, 1, 2, but NOT index 3
</source>
:#The index allows some extra functions using the colon and negative numbers
<source>
course_name = 'Open System Automation'
course_name[12:]  # Return the substring from index 12 until the end of the string
course_name[5:]   # Return the substring from index 5 until the end of the string
course_name[-1]   # Return the last character
</source>
:#With negative indexes the index works from the right side of the string, '''-1''' being the last character, '''-2''' the second last character, and so on
<source>
course_name = 'Open System Automation'
course_name[-1]
course_name[-2]
</source>
:#This allows substrings to be returned like below
<source>
course_name = 'Open System Automation'
course_name[-10:]                                   # Return the last ten characters
course_name[-10:-6]                                 # Try and figure out what this is returning
course_name[0:4] + course_name[-10:-6]              # Combine substrings together
substring = course_name[0:4] + course_name[-10:-6]  # Save the combined substring as a new string for later
substring
</source>
:#The real power found in substrings goes beyond just manually writing index values and getting back words. The next part of this investigation will cover how to search through a string for a specific word, letter, or number and return the index of that search result.
:'''Create a Python Script Demonstrating Substrings'''
::'''Perform the Following Instructions'''
:::#Create the '''~/ops435/lab4/lab4d.py''' script. The purpose of this script is to demonstrate creating and manipulating strings.
:::#The script will contain four functions, each of which returns a single string. Use the following template to get started:
<source>
#!/usr/bin/env python3
# Strings 1

str1 = 'Hello World!!'
str2 = 'Seneca College'

num1 = 1500
num2 = 1.50

def first_five():

def last_seven():

def middle_number():

def first_three_last_three():

if __name__ == '__main__':
    print(first_five(str1))
    print(first_five(str2))
    print(last_seven(str1))
    print(last_seven(str2))
    print(middle_number(num1))
    print(middle_number(num2))
    print(first_three_last_three(str1, str2))
    print(first_three_last_three(str2, str1))
</source>
:::*The first_five() function accepts a single string argument
:::*The first_five() function returns a string that contains the first five characters of the argument given
:::*The last_seven() function accepts a single string argument
:::*The last_seven() function returns a string that contains the last seven characters of the argument given
:::*The middle_number() function accepts an integer as an argument
:::*The middle_number() function returns a string containing the second and third characters in the number
:::*The first_three_last_three() function accepts two string arguments
:::*The first_three_last_three() function returns a single string that starts with the first three characters of argument1 and ends with the last three characters of argument2
:::*Example: first_three_last_three('abcdefg', '1234567') returns the single string 'abc567'
::::'''Sample Run 1'''
<source>
run lab4d.py
Hello
Senec
World!!
College
50
.5
Helege
Send!!
</source>
::::'''Sample Run 2 (import)'''
<source>
import lab4d
str1 = 'Hello World!!'
str2 = 'Seneca College'
num1 = 1500
num2 = 1.50
lab4d.first_five(str1)
'Hello'
lab4d.first_five(str2)
'Senec'
lab4d.last_seven(str1)
'World!!'
lab4d.last_seven(str2)
'College'
lab4d.middle_number(num1)
'50'
lab4d.middle_number(num2)
'.5'
lab4d.first_three_last_three(str1, str2)
'Helege'
lab4d.first_three_last_three(str2, str1)
'Send!!'
</source>
:::3. Exit the ipython3 shell, download the checking script and check your work. Enter the following commands from the bash shell.
<source>
cd ~/ops435/lab4/
pwd #confirm that you are in the right directory
ls CheckLab4.py || wget matrix.senecac.on.ca/~acoatley-willis/CheckLab4.py
python3 ./CheckLab4.py -f -v lab4d
</source>
:::4. Before proceeding, make certain that you identify any and all errors in lab4d.py. Proceed to the next step only when the checking script tells you everything is OK.<br><br>
== PART 2 - String Formatting ==
In Python, string concatenation with plus signs and commas is very limited and becomes very messy when used at length with many values/calculations. This section will cover the format() function that every string is able to use. The format() function allows for well formatted code, aligning text, and converting values efficiently and cleanly. While this section uses lists and dictionaries, remember that these are actually lists of strings and dictionaries with string values.
:'''Perform the Following Steps:'''
:#Start the ipython3 shell
<source>
ipython3
</source>
:#To use the format() function on a string, place .format() on the end of the string. Arguments can be provided to the format() function
<source>
print('College is Great'.format())
</source>
:#The above example does not actually do any formatting. Next, add a string using the format() function arguments
<source>
print('College is {}'.format('Great'))
</source>
:#When format() finds '''{}''' curly braces, it performs special actions. If format() finds '''{}''' with nothing inside, it substitutes the string from its arguments. These arguments are matched by position, left to right
<source>
print('{} {} {}'.format('a', 'b', 'c'))
</source>
:#However, this positional approach, while quick, easy, and clean, has an issue: if there are more curly braces '''{}''' in the string than arguments given to format(), it will create an error. This next string is going to throw an error
<source>
print('{} {} {} {}'.format('a', 'b', 'c'))
</source>
:#For situations like the above, or if reusing strings more than once is important, positional values can be placed inside the curly braces
<source>
print('{0} {1} {1} {2}'.format('a', 'b', 'c'))
print('{2} {2} {1} {0}'.format('a', 'b', 'c'))
</source>
:#These positions make formatting each string much less prone to errors while also making the string being formatted much clearer.
:#To improve on formatting further, provide the format() function with string variables
<source>
course_name = 'Open System Automation'
course_code = 'OPS435'
print('{0} {1} {0}'.format(course_code, course_name))
</source>
:#The format() function by default tries to use values as strings; all values in '''{}''' are displayed as strings unless extra options are given
<source>
course_number = 435  # This is an integer
print('This is displaying a string by default: {0}'.format(course_number))
</source>
:#Next place a list inside the format() function and access the values. There are two ways to use the list here
<source>
list1 = [1,2,3,4,5]  # While this is a list of numbers, format() by default tries to print everything as a str()
print('{0[0]} {0[1]} {0[2]} {0[3]} {0[4]}'.format(list1))  # Access the values of the list using indexes {position[index]}
print('{0} {1} {2} {3} {4}'.format(*list1))                # Use *list1 to expand the list into multiple positional arguments for format()
print('{} {} {} {} {}'.format(*list1))                     # Use *list1 to expand the list into multiple positional arguments for format()
print('{0} {1} {2} {3} {4}'.format(1,2,3,4,5))             # Using *list1 in this situation is the same as this
</source>
:#Next place a dictionary inside the format() function and access the values. Again there are a few ways to use the dictionary
<source>
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
print('{}'.format(dict_york))                      # Print out whole dictionary
print('{0}'.format(dict_york))                     # Print out whole dictionary using format arguments position
print('{0[City]} {0[Country]}'.format(dict_york))  # Print out values using position and key {position[key]}
</source>
:#With dictionaries, however, instead of using the positional argument '''0''' for each access to a key, Python allows for expansion of keyword arguments.
:#First take a look at an example of keyword arguments. Place the keyword variable name in between '''{}''' and add the keyword argument to the format() function
<source>
print('{string1} is {string2} {string2}'.format(string1='College', string2='Great!'))
</source>
:#Variables may also be passed to these keyword arguments
<source>
college = 'Seneca College'
print('{string1} is {string2} {string2}'.format(string1=college, string2='Great!'))
</source>
:#Now back to dictionaries, using keyword arguments (sometimes referred to as kwargs in Python). The dictionary can be quickly and easily expanded into these keyword arguments using the syntax '''**dictionary_name'''. Pay close attention to the keys inside the dictionary and the values associated with each key
<source>
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
print('{City} {Province} {Country}'.format(**dict_york))  # Uses the dictionary's keyword arguments
print('{City} {Province} {Country}'.format(City='Toronto', Province='ON', Country='Canada'))  # Creates new keyword arguments
</source>
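:The positional, index-based, and keyword techniques from this part can be compared side by side. The sketch below is only an illustration: the '''student''' dictionary and its values are invented for this example and are not part of the lab scripts. All three format() calls are expected to build the same string:
<source>
# Hypothetical data, used only for this illustration
student = {'name': 'Pat', 'course': 'OPS435', 'lab': 4}

# Positional arguments: values are passed one by one
by_position = '{0} is working on lab {1}'.format(student['name'], student['lab'])

# Position plus key: the whole dictionary is argument 0
by_key = '{0[name]} is working on lab {0[lab]}'.format(student)

# Keyword arguments: **student expands the dictionary into name=..., course=..., lab=...
by_keyword = '{name} is working on lab {lab}'.format(**student)

print(by_position)
print(by_key)
print(by_keyword)
</source>
:Keys that are not mentioned in the string (such as '''course''' here) are simply ignored, which is why expanding a whole dictionary with '''**''' is convenient.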
== PART 3 - String Formatting Expanded ==
This next section expands on how to use the format() function by bringing in the ability to work with numbers and align text.
:'''Perform the Following Steps'''
:#Start the ipython3 shell
<source>
ipython3
</source>
:#Start with numbers first. Study the following examples showing the different ways to change the output inside the curly braces '''{}''' while formatting. This formatting change is done by placing a colon inside the curly braces, followed by a letter '''{0:f}'''
<source>
number1 = 50
number2 = -50
number3 = 1.50
print('{0:f}'.format(number1))  # The 0 is the position just like other format() functions
print('{0:f}'.format(number2))  # The colon separates the position/index areas from the extra functionality
print('{0:f}'.format(number3))  # The f stands for fixed point number
</source>
:#A fixed point number means that the number of digits that come after the decimal point can be controlled. Try changing the '''.2''' to any other number and experiment
<source>
print('{0:.0f}'.format(number1))  # Show no digits after decimal point
print('{0:.2f}'.format(number2))  # Show two digits after decimal point
print('{0:.1f}'.format(number3))  # Show one digit after decimal point
</source>
:#Numbers can be displayed with the '''-''' or '''+''' signs before the digits; this could be important in formatting
<source>
print('{0: f}'.format(number1))                   # Show a space where plus sign should be
print('{0: f}'.format(number2))                   # Shows negative sign normally
print('{0: f}\n{1: f}'.format(number1, number2))  # The space before the f lines up positive and negative numbers
</source>
:#Placing a '''+''' before the f changes the format so that plus signs show up for positive numbers and negative signs show up for negative numbers
<source>
print('{0:+f}'.format(number1))                   # Shows a plus sign for a positive number
print('{0:+f}'.format(number2))                   # Shows negative sign normally
print('{0:+f}\n{1:+f}'.format(number1, number2))  # The signs line up positive and negative numbers
</source>
:#Combining fixed point positions and sign together looks like the following
<source>
print('{0:+.2f}'.format(number1))
print('{0:+.2f}'.format(number2))
print('{0:+.2f}\n{1:+.2f}'.format(number1, number2))
</source>
:#In the event that the numbers being shown are all integers and do not require the decimal values, instead of '''{:f}''' use '''{:d}''' for decimal INTEGERS
<source>
print('{0:d}'.format(number1))
print('{0:d}'.format(number2))
print('{0: d}'.format(number1))
print('{0: d}'.format(number2))
print('{0:+d}'.format(number1))
print('{0:+d}'.format(number2))
</source>
:#When using '''{:d}''' be careful not to use numbers with a decimal value in them; the following will create an error
<source>
number3 = 1.50
print('{0:d}'.format(number3))
</source>
:#Next let's move on to alignment of text. This is the process of adding padding to the left of our text, the right of our text, or aligning to the centre (padding both left and right). Through alignment the field size of any string can be set. Start by using a string but placing a number value after the colon '''{0:10}'''
<source>
string1 = 'hello'
string2 = 'world'
print('{0:10}{1}'.format(string1, string2))  # Make sure string1 is 10 characters aligned to the left
print('{0:6}{1}'.format(string1, string2))   # Make sure string1 is 6 characters aligned to the left
</source>
:#By default format() aligns strings to the left; the symbol to explicitly do this is '''<''' which looks like '''{0:<10}''' when used in the curly braces. Now whenever a different string is placed inside, it always aligns to the left in a field of 10 characters.
:#This allows for very well structured output
<source>
print('{:<10} {:<10} {:<10}\n{:<10} {:<10} {:<10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))  # Without positional argument numbers
print('{0:<10} {1:<10} {2:<10}\n{3:<10} {4:<10} {5:<10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))  # With positional argument numbers
</source>
:#Next try using right alignment with the '''>''' symbol replacing the left alignment
<source>
print('{:>10} {:>10} {:>10}\n{:>10} {:>10} {:>10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))  # Without positional argument numbers
print('{0:>10} {1:>10} {2:>10}\n{3:>10} {4:>10} {5:>10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))  # With positional argument numbers
</source>
:#Finally, center alignment with the '''^''' symbol
<source>
print('{:^10} {:^10} {:^10}\n{:^10} {:^10} {:^10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))  # Without positional argument numbers
print('{0:^10} {1:^10} {2:^10}\n{3:^10} {4:^10} {5:^10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))  # With positional argument numbers
</source>
:#The padding character can be changed to any other character that is wanted for the output. By default it is a space, but it could be anything else by adding an additional character, such as '''M''' or '''*''', before the alignment character, as in '''{:M<10}'''
<source>
print('{:*^10} {:*^10} {:*^10}\n{:*^10} {:*^10} {:*^10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))  # Without positional argument numbers
print('{:.^10} {:.^10} {:.^10}\n{:.^10} {:.^10} {:.^10}'.format('abc', 'def', 'ghi', 'jkl123', 'mno123', 'pqr123'))  # Without positional argument numbers
</source>
:#Make the output look better by creating borders. Create this function that displays some data enclosed in borders, and try to understand what is happening.
:#This function does not return any value; it only prints formatted text
<source>
def print_with_borders():
    print('|{:-^10}|'.format('nums'))
    print('|{:^5}{:^5}|'.format(1,2))
    print('|{:^5}{:^5}|'.format(3,4))
    print('|{}|'.format('-'*10))

# Run the function
print_with_borders()
</source>
:'''Create a Script Demonstrating Formatting Strings'''
::'''Perform the Following Instructions:'''
:::#Create the '''~/ops435/lab4/lab4e.py''' script. The purpose of this script is to demonstrate formatting string output from a large data structure.
:::#Use the following template to get started:
<source>
#!/usr/bin/env python3
# Formatted Strings

dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
college = 'Seneca College'

def print_college_address():
    # Prints out the keys and values from the dictionary
    # Formats the output to have a title bar with a title
    # EXACT SAME output as the samples

if __name__ == '__main__':
    print_college_address(dict_york, college)
</source>
:::*The print_college_address() function does NOT return anything
:::*The print_college_address() function accepts two arguments
:::*The first argument is a dictionary
:::*The second argument is a string
:::*The title printed is center aligned in 40 characters using '-' instead of space for padding
:::*The keys printed are center aligned in 20 characters
:::*The values printed are center aligned in 20 characters
:::*The output must match the sample output EXACTLY; if one character is off it will be wrong
::::'''Sample Run 1:'''
<source>
run lab4e.py
|-------------Seneca College-------------|
|      Address          70 The Pond Rd   |
|      Province               ON         |
|    Postal Code            M3J3M6       |
|        City              Toronto       |
|      Country              Canada       |
|----------------------------------------|
</source>
::::'''Sample Run 2 (with import):'''
<source>
import lab4e
dict_york = {'Address': '70 The Pond Rd', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M3J3M6', 'Province': 'ON'}
dict_newnham = {'Address': '1750 Finch Ave E', 'City': 'Toronto', 'Country': 'Canada', 'Postal Code': 'M2J2X5', 'Province': 'ON'}
college = 'Seneca College'
lab4e.print_college_address(dict_york, college)
|-------------Seneca College-------------|
|      Address          70 The Pond Rd   |
|      Province               ON         |
|    Postal Code            M3J3M6       |
|        City              Toronto       |
|      Country              Canada       |
|----------------------------------------|
lab4e.print_college_address(dict_newnham, college)
|-------------Seneca College-------------|
|      Address         1750 Finch Ave E  |
|      Province               ON         |
|    Postal Code            M2J2X5       |
|        City              Toronto       |
|      Country              Canada       |
|----------------------------------------|
</source>
:::3. Exit the ipython3 shell, download the checking script and check your work. Enter the following commands from the bash shell.
<source>
cd ~/ops435/lab4/
pwd #confirm that you are in the right directory
ls CheckLab4.py || wget matrix.senecac.on.ca/~acoatley-willis/CheckLab4.py
python3 ./CheckLab4.py -f -v lab4e
</source>
:::4. Before proceeding, make certain that you identify any and all errors in lab4e.py. Proceed to the next step only when the checking script tells you everything is OK.<br><br>
== PART 2 - String Manipulation ==
== PART 3 - Regular Expressions ==
= LAB 4 SIGN-OFF (SHOW INSTRUCTOR) =