Difference between revisions of "OPS435 Python Assignment 1 2017 - 3"

From CDOT Wiki

Revision as of 20:24, 4 September 2017

Assignment 1 - Parsing a log file

Weight: 15% of the overall grade.

Due Date: Ask your professor for exact date.

Late penalty: 10% per day (including weekends), and the assignment must be completed in order to pass the course.

Overview

Often, system administrators need to analyze log files. This can be done using a paginator such as less when your system has just been set up and/or you're the only user. On a production system it is not unusual to have thousands of legitimate users per month accessing the server's services, plus thousands more bots looking for unpatched vulnerabilities, brute-forcing username/password pairs, or just downloading every available file on your web server.

In this assignment you will create a program that will help you, as an Apache server administrator, to answer questions about the status, load, and security of your web server. You will not need to set up a mail server for this assignment, though you're welcome to use the one you've set up in OPS335 as a practice machine.

Instructions

Name and Parameters

Your Python 3 program will be named check_apache_log.py and will accept the following parameters:

  • --default or -d as an optional first argument, followed by:
  • filename, or:
  • filename1 filename2 filename3 etc... - any number of filenames from 1 to as many as the command-line supports.
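One possible way to accept these parameters is the standard argparse module. The sketch below is only an illustration: the function name parse_args and the help strings are assumptions, the effect of --default is not described here so the flag is merely recorded, and argparse does not enforce that -d comes before the filenames.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line (argv=None means use sys.argv[1:])."""
    parser = argparse.ArgumentParser(
        prog="check_apache_log.py",
        description="Apache Log Analyser",
    )
    # Optional flag; what it changes is not specified in this section.
    parser.add_argument("-d", "--default", action="store_true",
                        help="optional flag given before the filenames")
    # nargs="+" requires at least one filename and collects them in a list.
    parser.add_argument("filenames", nargs="+", metavar="filename",
                        help="Apache log file(s) to read")
    return parser.parse_args(argv)


# Example with a hypothetical command line:
args = parse_args(["-d", "access_log", "access_log.1"])
print(args.default)    # True
print(args.filenames)  # ['access_log', 'access_log.1']
```

Running the program with no filename makes argparse print a usage message and exit, which matches the "from 1 to as many" requirement.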

Header

Your program must be a single source file, and at the top of that file it will contain the following true statement as a comment (replace Andrew Smith with your own name):

OPS435 Assignment 1 - Fall 2017
check_apache_log.py
Author: Andrew Smith
The source code in this file (check_apache_log.py) is original work written 
by Andrew Smith and has not been copied from any other source including any
person, textbook, or online resource. I have not shared this work with anyone
or anything except for submission for grading. I understand that the 
Academic Honesty Policy is not a joke and violators will be punished.

Main Menu

Your program will be primarily menu-based instead of parameter-driven. That means the user will ask the program to do something after the program is already running, which is different from a typical command-line tool. When the program starts, it will present the user with the following menu:

Reading log files... done.

Apache Log Analyser - Main Menu
===============================
1) Successful Requests
2) Failed Requests
q) Quit

Make sure the line of equal signs is not hard-coded. You must be able to quickly change the title and not have to update a string with some number of extra or fewer equal signs. You might want to make a function to display this line, and use that function for the other menus as well.
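A minimal sketch of such a function, assuming the underline should be exactly as long as the title. The name format_title and its underline_char parameter are made up for this example.

```python
def format_title(title, underline_char="="):
    """Return the title followed by an underline of the same length."""
    return f"{title}\n{underline_char * len(title)}"


# The underline length follows the title, so changing the text is enough.
print(format_title("Apache Log Analyser - Main Menu"))
print(format_title("Apache Log Analyser - Successful Requests Menu"))
```

Because the underline is computed with len(), the same function can serve every menu in the program.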

The "Reading log files" message must display only once when your program starts, not every time the menu is displayed. You may find it easier to code this functionality after you're done writing the code for the menu itself.

Options 1 and 2 will each display a new menu.

The q option is self-explanatory.
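The menu behaviour described above could be structured roughly as follows. This is a sketch under assumptions, not a required design: the function names print_menu, main_menu and run are invented here, the submenus are left as placeholders, and the read_choice parameter exists only so the loop can be demonstrated without typing.

```python
def print_menu(title, options):
    """Print a title, an underline of matching length, and the options."""
    print(title)
    print("=" * len(title))
    for key, label in options:
        print(f"{key}) {label}")


def main_menu(read_choice=input):
    """Show the main menu until the user quits; return the choices made."""
    options = [("1", "Successful Requests"),
               ("2", "Failed Requests"),
               ("q", "Quit")]
    chosen = []
    while True:
        print_menu("Apache Log Analyser - Main Menu", options)
        choice = read_choice("> ").strip().lower()
        if choice == "q":
            return chosen
        if choice in ("1", "2"):
            chosen.append(choice)  # placeholder: show the submenu here
        else:
            print("Invalid choice.")


def run(read_choice=input):
    # Printed before the loop starts, so it appears exactly once.
    print("Reading log files... done.")
    return main_menu(read_choice)


# Non-interactive demonstration with scripted answers instead of input():
answers = iter(["1", "x", "q"])
run(read_choice=lambda prompt: next(answers))
```

In a real program, run() would be called with no argument so that choices come from the keyboard, and the log files would be read at the point where the message is printed.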

Successful requests menu

Apache Log Analyser - Successful Requests Menu
==============================================
1) How many total requests (Code 200)
2) How many requests from Seneca (IPs starting with 142.204)
3) How many requests for isomaster-1.3.13.tar.bz2
q) Return to Main Menu

Each line in the log file is in the following Apache log format:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

One example line is:

109.86.167.47 - - [29/Aug/2017:10:22:49 -0400] "GET /isomaster/releases/isomaster-1.3.13.tar.bz2 HTTP/1.1" 200 245085 "http://littlesvr.ca/isomaster/download/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0"

You should use the Python re module; see https://docs.python.org/3/library/re.html for documentation. You may use the following regular expression to extract the components from each line:

([(\d\.)]+) - - \[(.*?)\] "(.*?)" (\d+) (\d+) "(.*?)" "(.*?)"
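As an illustration, the expression can be compiled once and applied to the example line shown earlier. The names LOG_PATTERN and parse_line and the dictionary keys are assumptions chosen for this sketch. Lines that do not fit the pattern (for example, a "-" in the size field, which Apache writes for %b when no bytes are sent) give None here, and how to treat them is left to you.

```python
import re

# The regular expression from the assignment, compiled once for reuse.
LOG_PATTERN = re.compile(
    r'([(\d\.)]+) - - \[(.*?)\] "(.*?)" (\d+) (\d+) "(.*?)" "(.*?)"'
)


def parse_line(line):
    """Return a dict of the seven captured fields, or None if no match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    keys = ("ip", "timestamp", "request", "status", "size",
            "referer", "user_agent")
    return dict(zip(keys, match.groups()))


example = (
    '109.86.167.47 - - [29/Aug/2017:10:22:49 -0400] '
    '"GET /isomaster/releases/isomaster-1.3.13.tar.bz2 HTTP/1.1" '
    '200 245085 "http://littlesvr.ca/isomaster/download/" '
    '"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) '
    'Gecko/20100101 Firefox/55.0"'
)
entry = parse_line(example)
print(entry["ip"], entry["status"], entry["size"])  # 109.86.167.47 200 245085
```

All captured values are strings, so the status is compared as "200" rather than the integer 200 unless you convert it.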

The questions are self-explanatory; provide answers formatted as you see fit.
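One way to answer the three questions, assuming each log line has already been turned into a dictionary with "ip", "status" and "request" keys (a hypothetical layout, not something the assignment prescribes). This sketch also assumes that options 2 and 3 count only requests with code 200, since they appear in the successful requests menu; check with your professor if that reading matters.

```python
SENECA_PREFIX = "142.204."
TARGET_FILE = "isomaster-1.3.13.tar.bz2"


def successful(entries):
    """Keep only the entries whose status code is 200."""
    return [e for e in entries if e["status"] == "200"]


def count_from_seneca(entries):
    """Count successful requests whose IP starts with 142.204."""
    return sum(1 for e in successful(entries)
               if e["ip"].startswith(SENECA_PREFIX))


def count_for_file(entries, filename=TARGET_FILE):
    """Count successful requests whose request line mentions the file."""
    return sum(1 for e in successful(entries) if filename in e["request"])


# Made-up entries for illustration only (not taken from a real log):
sample = [
    {"ip": "142.204.1.2", "status": "200",
     "request": "GET /isomaster/releases/isomaster-1.3.13.tar.bz2 HTTP/1.1"},
    {"ip": "109.86.167.47", "status": "200",
     "request": "GET /index.html HTTP/1.1"},
    {"ip": "142.204.9.9", "status": "404",
     "request": "GET /missing HTTP/1.1"},
]
print("Total successful requests:", len(successful(sample)))
print("Successful requests from Seneca:", count_from_seneca(sample))
print("Successful requests for", TARGET_FILE + ":", count_for_file(sample))
```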