Thunderbird
Team Members
eMail All
Progress
Assignment 1
Profiling: LZW algorithm
Profiling: Ray-Tracing algorithm
Analysis
Ray-Tracing Algorithm
Assignment 2
Parallelize
Perfomance
Assignment 3
Optimization
global vs constant
Perfomance
Conclusion
Profiling: LZW algorithm
It's a simple version of LZW compression algorithm with 12 bit codes.
void compress(string input, int size, string filename) { unordered_map<string, int> compress_dictionary(MAX_DEF); //Dictionary initializing with ASCII for ( int unsigned i = 0 ; i < 256 ; i++ ){ compress_dictionary[string(1,i)] = i; } string current_string; unsigned int code; unsigned int next_code = 256; //Output file for compressed data ofstream outputFile; outputFile.open(filename + ".lzw"); for(char& c: input){ current_string = current_string + c; if ( compress_dictionary.find(current_string) ==compress_dictionary.end() ){ if (next_code <= MAX_DEF) compress_dictionary.insert(make_pair(current_string, next_code++)); current_string.erase(current_string.size()-1); outputFile << convert_int_to_bin(compress_dictionary[current_string]); current_string = c; } } if (current_string.size()) outputFile << convert_int_to_bin(compress_dictionary[current_string]); outputFile.close(); }
Using compiler settings (gcc version 5.2.0):
g++ -c -O2 -g -pg -std=c++14 lzw.cpp
10 MB text file
wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 10.txt real 0m4.302s user 0m3.072s sys 0m0.632s
Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ns/call ns/call name 45.83 0.55 0.55 compress(string, int, string) 36.67 0.99 0.44 14983735 29.37 29.37 _M_find_before_node(unsigned int, string const&, unsigned int) const 7.50 1.08 0.09 10489603 8.58 8.58 show_usage() 5.83 1.15 0.07 4493878 15.58 44.94 operator[](string const&) 4.17 1.20 0.05 _Z22convert_char_to_stringB5cxx11PKci 0.00 1.20 0.00 4097 0.00 0.00 _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*) 0.00 1.20 0.00 3841 0.00 29.37 _ZNSt10_HashtableINSt7 0.00 1.20 0.00 1 0.00 0.00 _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i 0.00 1.20 0.00 1 0.00 0.00 ~_Hashtable()
20 MB text file
wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 20.txt real 0m8.924s user 0m6.504s sys 0m2.008s
Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ns/call ns/call name 49.33 1.47 1.47 compress(string, int, string) 34.56 2.50 1.03 29962271 34.38 34.38 _M_find_before_node(unsigned int, string const&, unsigned int) const 7.05 2.71 0.21 8986654 23.37 57.74 operator[](string const&) 6.71 2.91 0.20 20975363 9.53 9.53 show_usage() 2.35 2.98 0.07 _Z22convert_char_to_stringB5cxx11PKci 0.00 2.98 0.00 4097 0.00 0.00 _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*) 0.00 2.98 0.00 3841 0.00 34.38 _ZNSt10_HashtableINSt7 0.00 2.98 0.00 1 0.00 0.00 _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i 0.00 2.98 0.00 1 0.00 0.00 ~_Hashtable()
30 MB text file
wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 30.txt real 0m13.637s user 0m9.665s sys 0m2.984s
Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ns/call ns/call name 45.59 1.86 1.86 compress(string, int, string) 37.25 3.38 1.52 44940806 33.82 33.82 _M_find_before_node(unsigned int, string const&, unsigned int) const 7.60 3.69 0.31 13479429 23.00 56.82 operator[](string const&) 6.62 3.96 0.27 31461123 8.58 8.58 show_usage() 2.94 4.08 0.12 _Z22convert_char_to_stringB5cxx11PKci 0.00 4.08 0.00 4097 0.00 0.00 _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*) 0.00 4.08 0.00 3841 0.00 33.82 _ZNSt10_HashtableINSt7 0.00 4.08 0.00 1 0.00 0.00 _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i 0.00 4.08 0.00 1 0.00 0.00 ~_Hashtable()
Profiling: Ray-tracing algorithm
Source Code: https://www.scratchapixel.com/lessons/3d-basic-rendering/introduction-to-ray-tracing/ray-tracing-practical-example
Profiling: bubble vs quick algorithm
It's a simple version of sorting algorithm - Source code
void BubbleSort(int arr[], int size) { int tmp; /*used for swapping*/ int i; int j; for (i = 0; i<size - 1; i++) { for (j = 0; j<size - 1 - i; j++) { if (arr[j + 1] < arr[j]) { /* compare the two neighbors */ tmp = arr[j]; /* swap a[j] and a[j+1] */ arr[j] = arr[j + 1]; arr[j + 1] = tmp; } } } } void InsertionSort(int arr[], int left, int right) { int curr; int i, j; for (i = left + 1; i <= right; i++) { curr = arr[i]; for (j = i; j>0 && arr[j - 1] > curr; j--) { arr[j] = arr[j - 1]; } arr[j] = curr; } } void QuickSort(int arr[], int left, int right) { if (right - left <= 3) { InsertionSort(arr, left, right); } else { int pivotpt = (left + right) / 2; //index of the pivot int i = left; int j = right - 1; swap(arr[pivotpt], arr[right]); pivotpt = right; int pivot = arr[pivotpt]; while (i<j) { while (i< right - 1 && arr[i]<pivot) i++; while (j > 0 && arr[j]>pivot) j--; if (i<j) swap(arr[i++], arr[j--]); } if (i == j && arr[i] < arr[pivotpt]) i++; swap(arr[i], arr[pivotpt]); QuickSort(arr, left, i - 1); QuickSort(arr, i + 1, right); } } void QuickSort(int arr[], int size) { QuickSort(arr, 0, size - 1); }
Using compiler settings (gcc version 5.2.0):
g++ -c -O2 -g -pg -std=c++14 a1.cpp
Sorting for 50000
Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ns/call ns/call name 99.77 4.42 4.42 BubbleSort(int*, int) 0.23 4.43 0.01 QuickSort(int*, int, int) 0.00 4.43 0.00 16651 0.00 0.00 InsertionSort(int*, int, int) 0.00 4.43 0.00 1 0.00 0.00 _GLOBAL__sub_I__Z10BubbleSortPii
Sorting for 100000
Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ns/call ns/call name 99.89 17.84 17.84 BubbleSort(int*, int) 0.06 17.85 0.01 33355 299.81 299.81 InsertionSort(int*, int, int) 0.06 17.86 0.01 QuickSort(int*, int, int) 0.00 17.86 0.00 1 0.00 0.00 _GLOBAL__sub_I__Z10BubbleSortPii
Sorting for 200000
Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ns/call ns/call name 99.97 71.90 71.90 BubbleSort(int*, int) 0.01 71.91 0.01 66645 150.05 150.05 InsertionSort(int*, int, int) 0.01 71.92 0.01 QuickSort(int*, int, int) 0.00 71.92 0.00 1 0.00 0.00 _GLOBAL__sub_I__Z10BubbleSortPii