WhySoSerial?
=== Assignment 1 ===
== '''WordProcessor''' - The Application ==
This application was built for assignment one. It is a simple string editor/translator: it accepts a file to be processed, takes words from that file, and uses a "codex" (a dictionary) to match them to specific meanings or words. It then replaces the string for that line and moves on to the next. Its intended purpose is to take large files and translate them into other languages. I believe this is a highly parallelizable problem, as many of the steps appear able to be done concurrently.
The parts of this application are as follows:

* The codex - a file with word pairs separated by a delimiter
* The book/file - a file that will be parsed and 'translated'
* WordProcessor - the application that tokenizes the input string, maps words through the codex, and writes the replaced strings back to the file. Future upgrades will allow WordProcessor to take in multiple files at a time and generate a translation for each.
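A minimal sketch of how these parts could fit together (Python is used here purely for illustration; the codex delimiter, file format, and function names are assumptions, not the actual implementation):

```python
def load_codex(path, delimiter=","):
    # Build a source-word -> target-word dictionary from a file of
    # delimiter-separated word pairs, one pair per line.
    codex = {}
    with open(path) as f:
        for line in f:
            source, target = line.strip().split(delimiter)
            codex[source] = target
    return codex

def lookup(line, codex):
    # Tokenize the line, map each token through the codex, and fall
    # back to the original token when no translation exists.
    return " ".join(codex.get(token, token) for token in line.split())

# Hypothetical word pairs standing in for a real codex file.
codex = {"hello": "bonjour", "world": "monde"}
print(lookup("hello big world", codex))  # -> bonjour big monde
```

Because each line is translated independently of the others, a loop over <code>lookup()</code> calls is an obvious candidate for parallelization.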
Highlighting one of the profiles, a significant amount of time can be observed in one particular method: ''lookup()''. It belongs to the WordProcessor class and consumes most of the application's run time. It is given each line of the file and makes additional calls to parse, tokenize, and replace strings within that line.
Looking at the overall test data, this is a very prominent trend: ''lookup()'' is always the most resource-intensive method. This confirms the Big-O assumptions made about this function.
'''Profile Overview'''
[[File:Profilesataglance.jpg]]
For this analysis, an Nvidia GeForce GTX 1070 GPU is available for testing, with a total of 1920 CUDA cores.

- The estimate that ~2/3 of time is spent in ''lookup()'' was deduced by taking a weighted average of the results. Files with small byte sizes or few lines of text were not taken into consideration.

- It is interesting to note that when a block of text was combined into one line, 52% of the time was spent splitting the string, and still ~47% of the time was spent in ''lookup()''.
Applying Amdahl's Law with P = 0.6667 and n = 1920 cores:

S(n) = 1 / ((1 - 0.6667) + 0.6667 / 1920) ≈ 3.0
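A quick check of the arithmetic (a sketch, assuming the ~2/3 parallelizable fraction estimated above):

```python
def amdahl_speedup(p, n):
    # Amdahl's Law: the serial fraction (1 - p) caps the speedup no
    # matter how many processors n work on the parallel fraction p.
    return 1.0 / ((1.0 - p) + p / n)

# ~2/3 of the run time parallelized across 1920 CUDA cores.
print(amdahl_speedup(0.6667, 1920))  # roughly 3x
```

The serial third of the run time dominates: even with 1920 cores, the theoretical speedup is only about 3x, so reducing the serial portion (e.g. the string splitting) matters as much as offloading ''lookup()''.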