Thunderbird

Revision as of 08:51, 10 April 2017 by Wlee64 (talk | contribs) (1. Parallelize)

Team Members

  1. Sanghun Kim
  2. Wonho Lee

Progress

Assignment 1

Profiling: LZW algorithm

This is a simple version of the LZW compression algorithm that uses 12-bit codes.

 void compress(string input, int size, string filename) {
     // Dictionary mapping each known string to its code
     unordered_map<string, int> compress_dictionary(MAX_DEF);
     // Initialize the dictionary with the 256 single-byte strings
     for (unsigned int i = 0; i < 256; i++) {
         compress_dictionary[string(1, static_cast<char>(i))] = i;
     }
     string current_string;
     unsigned int next_code = 256;
     // Output file for compressed data
     ofstream outputFile;
     outputFile.open(filename + ".lzw");
 
     for (char& c : input) {
         current_string += c;
         if (compress_dictionary.find(current_string) == compress_dictionary.end()) {
             // Unknown string: register it while codes remain
             if (next_code <= MAX_DEF)
                 compress_dictionary.insert(make_pair(current_string, next_code++));
             // Emit the code of the longest known prefix, then restart from c
             current_string.pop_back();
             outputFile << convert_int_to_bin(compress_dictionary[current_string]);
             current_string = c;
         }
     }
     // Flush the code of the final pending string
     if (current_string.size())
         outputFile << convert_int_to_bin(compress_dictionary[current_string]);
     outputFile.close();
 }
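
To illustrate how the dictionary built by compress() can be rebuilt on the reading side, here is a minimal decoding sketch. It is not taken from the project source: the name lzw_decode, the use of a vector of integer codes instead of the .lzw file format, and the limit of 4096 codes (assumed from the 12-bit code width) are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not from the project source): rebuilds the text
// from a sequence of LZW codes, assuming at most 4096 (2^12) codes.
std::string lzw_decode(const std::vector<int>& codes) {
    const int kMaxCodes = 4096;  // assumption: 12-bit code space
    std::unordered_map<int, std::string> dictionary;
    for (int i = 0; i < 256; ++i) {
        dictionary[i] = std::string(1, static_cast<char>(i));
    }
    if (codes.empty()) return "";

    int next_code = 256;
    // The first code is always a single byte; at() throws otherwise.
    std::string previous = dictionary.at(codes[0]);
    std::string output = previous;

    for (std::size_t i = 1; i < codes.size(); ++i) {
        const int code = codes[i];
        std::string entry;
        auto it = dictionary.find(code);
        if (it != dictionary.end()) {
            entry = it->second;
        } else if (code == next_code) {
            // The code refers to the entry the encoder had just created.
            entry = previous + previous[0];
        } else {
            throw std::invalid_argument("invalid LZW code");
        }
        output += entry;
        // Mirror the encoder: previous string plus first char of this one.
        if (next_code < kMaxCodes) {
            dictionary[next_code++] = previous + entry[0];
        }
        previous = entry;
    }
    return output;
}
```

The decoder never needs the dictionary to be stored in the file: each new entry is derived from the two most recent strings, in the same order the encoder created them.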

Compiled with the following settings (gcc version 5.2.0):

 g++ -c -O2 -g -pg -std=c++14 lzw.cpp

10 MB text file

 wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 10.txt
 real	0m4.302s
 user	0m3.072s
 sys	0m0.632s
 Flat profile:
 
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total           
  time   seconds   seconds    calls  ns/call  ns/call  name    
  45.83      0.55     0.55                             compress(string, int, string) 
  36.67      0.99     0.44 14983735    29.37    29.37  _M_find_before_node(unsigned int, string const&, unsigned int) const 
   7.50      1.08     0.09 10489603     8.58     8.58  show_usage() 
   5.83      1.15     0.07  4493878    15.58    44.94  operator[](string const&)  
   4.17      1.20     0.05                             _Z22convert_char_to_stringB5cxx11PKci  
   0.00      1.20     0.00     4097     0.00     0.00  _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*)  
   0.00      1.20     0.00     3841     0.00    29.37  _ZNSt10_HashtableINSt7  
   0.00      1.20     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i
   0.00      1.20     0.00        1     0.00     0.00  ~_Hashtable()

20 MB text file

 wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 20.txt
 real	0m8.924s
 user	0m6.504s
 sys	0m2.008s
 Flat profile:
 
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total           
  time   seconds   seconds    calls  ns/call  ns/call  name    
  49.33      1.47     1.47                             compress(string, int, string)
  34.56      2.50     1.03 29962271    34.38    34.38  _M_find_before_node(unsigned int, string const&, unsigned int) const
   7.05      2.71     0.21  8986654    23.37    57.74  operator[](string const&)
   6.71      2.91     0.20 20975363     9.53     9.53  show_usage()
   2.35      2.98     0.07                             _Z22convert_char_to_stringB5cxx11PKci
   0.00      2.98     0.00     4097     0.00     0.00  _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*)
   0.00      2.98     0.00     3841     0.00    34.38  _ZNSt10_HashtableINSt7
   0.00      2.98     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i
   0.00      2.98     0.00        1     0.00     0.00  ~_Hashtable()

30 MB text file

 wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 30.txt
 real	0m13.637s
 user	0m9.665s
 sys	0m2.984s
 Flat profile:
 
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total           
  time   seconds   seconds    calls  ns/call  ns/call  name    
  45.59      1.86     1.86                             compress(string, int, string)
  37.25      3.38     1.52 44940806    33.82    33.82  _M_find_before_node(unsigned int, string const&, unsigned int) const
   7.60      3.69     0.31 13479429    23.00    56.82  operator[](string const&)
   6.62      3.96     0.27 31461123     8.58     8.58  show_usage()
   2.94      4.08     0.12                             _Z22convert_char_to_stringB5cxx11PKci
   0.00      4.08     0.00     4097     0.00     0.00  _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*)
   0.00      4.08     0.00     3841     0.00    33.82  _ZNSt10_HashtableINSt7
   0.00      4.08     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i
   0.00      4.08     0.00        1     0.00     0.00  ~_Hashtable()
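
In all three flat profiles, about 35 to 37 percent of the samples fall in the hash-table lookup (_M_find_before_node), and part of those calls come from operator[] looking up a prefix that find() had already matched on an earlier iteration. One possible way to avoid that second lookup is to remember the code of the last matched prefix. The sketch below is only an illustration of that idea and has not been profiled: the name lzw_encode, the vector of codes returned instead of file output, and the 4096-code limit are assumptions, not the project's implementation.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical variant of the encoder (not the project's code): keeps the
// code of the longest matched prefix so no second lookup is needed.
std::vector<int> lzw_encode(const std::string& input) {
    const int kMaxCodes = 4096;  // assumption: 12-bit code space
    std::unordered_map<std::string, int> dictionary;
    for (int i = 0; i < 256; ++i) {
        dictionary[std::string(1, static_cast<char>(i))] = i;
    }

    std::vector<int> codes;
    std::string current;
    int current_code = -1;  // code of `current` once it has matched
    int next_code = 256;

    for (char c : input) {
        current += c;
        auto it = dictionary.find(current);
        if (it != dictionary.end()) {
            current_code = it->second;  // remember the match
            continue;
        }
        if (next_code < kMaxCodes) {
            dictionary.emplace(current, next_code++);
        }
        codes.push_back(current_code);  // code of the longest known prefix
        current.assign(1, c);
        // Single-byte strings map to their byte value by construction.
        current_code = static_cast<unsigned char>(c);
    }
    if (!current.empty()) {
        codes.push_back(current_code);
    }
    return codes;
}
```

For the input "TOBEORNOTTOBEORTOBEORNOT" this produces 16 codes, the first nine being plain byte values and the rest referring to dictionary entries created along the way.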

Profiling: Ray-tracing algorithm

Source Code: https://github.com/ksanghun/CUDA_raytrace/blob/master/GPUAssaginemt/cputest.cpp

 


Ray-Tracing Algorithm

 

Ray-sphere Intersection
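
As a point of reference for this step, the following is a minimal sketch of one common way to test a ray against a sphere, by solving the quadratic for the hit distance t. It is not copied from cputest.cpp and may differ from what that file does; the names Vec3, dot and intersect_sphere are hypothetical, and the ray direction is assumed to be normalized.

```cpp
#include <cmath>

// Hypothetical minimal vector type for the sketch.
struct Vec3 {
    double x, y, z;
};

inline Vec3 operator-(const Vec3& a, const Vec3& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

inline double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Tests the ray origin + t * dir against a sphere.
// Assumes dir has unit length. On a hit, returns true and stores the
// nearest non-negative distance in t; otherwise returns false.
bool intersect_sphere(const Vec3& origin, const Vec3& dir,
                      const Vec3& center, double radius, double& t) {
    const Vec3 oc = origin - center;
    // |oc + t*dir|^2 = r^2  =>  t^2 + 2*b*t + c = 0 with |dir| = 1
    const double b = dot(oc, dir);
    const double c = dot(oc, oc) - radius * radius;
    const double discriminant = b * b - c;
    if (discriminant < 0.0) return false;  // the ray misses the sphere

    const double root = std::sqrt(discriminant);
    const double t_near = -b - root;
    const double t_far = -b + root;
    if (t_far < 0.0) return false;  // the sphere is entirely behind the ray

    // If the origin is inside the sphere, t_near is negative: use t_far.
    t = (t_near >= 0.0) ? t_near : t_far;
    return true;
}
```

A ray starting at the origin and pointing along -z hits a unit sphere centered at (0, 0, -5) at distance 4, which is the center distance minus the radius.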

 

Trace

 

Floating-Point Considerations

 


Assignment 2

1. Parallelize

- render()


 


- main()


 

2. Performance

 


 


Assignment 3

1. Optimize

- Global to constant memory


 

2. Performance

 


 


3. GPU Occupancy

 


Conclusion

1. Output

Video: https://youtu.be/3wV-ObHWZhg

2. Performance