Difference between revisions of "WhySoSerial?"
(Created page with "=== Assignment 1 === === File Compression === === Dictionary Translation === === Image Processing ===") |
(→Assignment 1) |
||
Line 1: | Line 1: | ||
=== Assignment 1 === | === Assignment 1 === | ||
+ | |||
+ | WordProcessor | ||
+ | - The Application | ||
+ | This application was built for assignment one. It is a sort of string editor/ translator was made. It accepts files to be processed by taking words from the file itself and using "codex" or dictionary, to match them to specific meanings or words. Then it replaces the string for that line and moves on to the next. Its intended purpose is to take large files and translate them into other languages. I believe this can be a highly parallelizable problem as many of the steps seem that they can be done in synchronous. | ||
+ | The parts to this application are as follows; | ||
+ | The codex - a file with word pairs separated by a delimiter | ||
+ | The book/file - a file that will be parsed and 'translated' | ||
+ | WordProcessor - the application that performs a tokenization of the input string, mapping of the codex, and replacing of string to file. | ||
+ | |||
+ | -Issues | ||
+ | The main concern with this program is that often times it spends much of its time is looking something up and comparing it to each token generated by a split. This can be quite time consuming as it then has to go through a list of translations to find a matching pair. To increase speeds further, I had implemented the codex as a map. | ||
+ | |||
+ | -Run-time Analysis Big-O | ||
+ | split() | ||
+ | n : Linear | ||
+ | lookup() - Composite function, has 3 separate function calls | ||
+ | n^2(log n) : Log Quadratic worst/ Log Linear best | ||
+ | replace() | ||
+ | n : The complexity of string::Replace is linear according to cplusplus.com/reference/string/string/replace/ | ||
+ | |||
+ | Replace and Split were added to this analysis because they may be improved in the future in order to add features to the application itself. This will more than likely increase their complexity and therefore its runtimes. In addition, these functions will be kept on watch as they are called from lookup(). | ||
+ | -Profile Results | ||
+ | gprof - More profiles can be found at https://github.com/Pooch11/DPS915/tree/master/TestFiles/Profiles | ||
+ | |||
+ | Each sample counts as 0.01 seconds. | ||
+ | % cumulative self self total | ||
+ | time seconds seconds calls ns/call ns/call name | ||
+ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ||
+ | 60.00 0.06 0.06 '''WordProcessor::lookup'''(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) | ||
+ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ||
+ | 30.00 0.09 0.03 472125 63.54 71.53 [''Split''] void std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_emplace_back_aux<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&&&) | ||
+ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ||
+ | 10.00 0.10 0.01 1252187 7.99 7.99 [''mapping''] std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::operator=(std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&) | ||
+ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ||
+ | 0.00 0.10 0.00 112230 0.00 0.00 '''WordProcessor::replace'''(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) | ||
+ | |||
+ | Highlighting one of the profile; a significant amount of time can be observed in one particular method - lookup. It belongs to the class WordProcessor. | ||
+ | |||
+ | Looking at the overall test data, this is a very prominent trend. Lookup is always the most resource intensive method. This confirms assumptions made for Big-O regarding this funtion | ||
+ | -Profile Overview | ||
+ | [[File:https://github.com/Pooch11/DPS915/blob/master/TestFiles/Profiles/Profilesataglance.jpg]] | ||
+ | |||
+ | -Amdahls Predicted speed increase | ||
+ | Given the formula S(n) = 1 / 1 - P + P / N | ||
+ | P = the percentage portion of parallizable code | ||
+ | N = the number of cores to distribute on | ||
+ | S(n) = the final speed increase as a multiplicative | ||
+ | |||
+ | For this analysis - the Nvidia 1070 GPU is available for testing and has a total of 1920 CUDA cores. | ||
+ | S(n) = 1 / 1 - (0.6667) + 0.6667 / 1920 | ||
+ | = 1 / 0.3333 + 0.00034723 | ||
+ | = 1 / 0.333647239 | ||
+ | = 2.99x | ||
+ | Therefore by Amdahl's law, there is a potential speed increase of 2.99x if this component were to be paralleled. | ||
+ | |||
+ | -Additional Info Available @ | ||
+ | - [https://github.com/Pooch11/DPS915] GitHub Link | ||
=== File Compression === | === File Compression === |
Revision as of 04:25, 25 February 2017
Assignment 1
WordProcessor - The Application This application was built for assignment one. It is a sort of string editor/ translator was made. It accepts files to be processed by taking words from the file itself and using "codex" or dictionary, to match them to specific meanings or words. Then it replaces the string for that line and moves on to the next. Its intended purpose is to take large files and translate them into other languages. I believe this can be a highly parallelizable problem as many of the steps seem that they can be done in synchronous. The parts to this application are as follows; The codex - a file with word pairs separated by a delimiter The book/file - a file that will be parsed and 'translated' WordProcessor - the application that performs a tokenization of the input string, mapping of the codex, and replacing of string to file.
-Issues The main concern with this program is that often times it spends much of its time is looking something up and comparing it to each token generated by a split. This can be quite time consuming as it then has to go through a list of translations to find a matching pair. To increase speeds further, I had implemented the codex as a map.
-Run-time Analysis Big-O split() n : Linear lookup() - Composite function, has 3 separate function calls n^2(log n) : Log Quadratic worst/ Log Linear best replace() n : The complexity of string::Replace is linear according to cplusplus.com/reference/string/string/replace/
Replace and Split were added to this analysis because they may be improved in the future in order to add features to the application itself. This will more than likely increase their complexity and therefore its runtimes. In addition, these functions will be kept on watch as they are called from lookup(). -Profile Results gprof - More profiles can be found at https://github.com/Pooch11/DPS915/tree/master/TestFiles/Profiles
Each sample counts as 0.01 seconds.
% cumulative self self total time seconds seconds calls ns/call ns/call name
60.00 0.06 0.06 WordProcessor::lookup(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
30.00 0.09 0.03 472125 63.54 71.53 [Split] void std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_emplace_back_aux<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&&&)
10.00 0.10 0.01 1252187 7.99 7.99 [mapping] std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >::operator=(std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&)
0.00 0.10 0.00 112230 0.00 0.00 WordProcessor::replace(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
Highlighting one of the profile; a significant amount of time can be observed in one particular method - lookup. It belongs to the class WordProcessor.
Looking at the overall test data, this is a very prominent trend. Lookup is always the most resource intensive method. This confirms assumptions made for Big-O regarding this funtion -Profile Overview File:Https://github.com/Pooch11/DPS915/blob/master/TestFiles/Profiles/Profilesataglance.jpg
-Amdahls Predicted speed increase Given the formula S(n) = 1 / 1 - P + P / N P = the percentage portion of parallizable code N = the number of cores to distribute on S(n) = the final speed increase as a multiplicative
For this analysis - the Nvidia 1070 GPU is available for testing and has a total of 1920 CUDA cores. S(n) = 1 / 1 - (0.6667) + 0.6667 / 1920
= 1 / 0.3333 + 0.00034723 = 1 / 0.333647239 = 2.99x
Therefore by Amdahl's law, there is a potential speed increase of 2.99x if this component were to be paralleled.
-Additional Info Available @ - [1] GitHub Link