From CDOT Wiki

Latest revision as of 08:53, 10 April 2017

Thunderbird

Team Members

  1. Sanghun Kim
  2. Wonho Lee

Progress

Assignment 1

Profiling: LZW algorithm

This is a simple version of the LZW compression algorithm with 12-bit codes.

 // Compress `input` with LZW and write the codes to "<filename>.lzw".
 // MAX_DEF and convert_int_to_bin() are defined elsewhere in the program.
 void compress(string input, int size, string filename) {
     unordered_map<string, int> compress_dictionary(MAX_DEF);
     // Initialize the dictionary with the 256 single-byte strings
     for (unsigned int i = 0; i < 256; i++) {
         compress_dictionary[string(1, static_cast<char>(i))] = i;
     }
     string current_string;
     unsigned int next_code = 256;
     // Output file for compressed data
     ofstream outputFile;
     outputFile.open(filename + ".lzw");
 
     for (char& c : input) {
         current_string = current_string + c;
         if (compress_dictionary.find(current_string) == compress_dictionary.end()) {
             // Unknown sequence: add it while codes remain.
             // Note: the largest 12-bit code is 4095, so this bound is only
             // safe if MAX_DEF is at most 4095.
             if (next_code <= MAX_DEF)
                 compress_dictionary.insert(make_pair(current_string, next_code++));
             // Emit the code for the longest known prefix, then restart from c
             current_string.erase(current_string.size() - 1);
             outputFile << convert_int_to_bin(compress_dictionary[current_string]);
             current_string = c;
         }
     }
     // Flush the code for whatever is left
     if (current_string.size())
         outputFile << convert_int_to_bin(compress_dictionary[current_string]);
     outputFile.close();
 }
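
To make the loop above easier to test in isolation, here is a minimal sketch of the same LZW idea that returns the codes in a vector instead of writing them to a file. The function name `lzw_codes` and the parameter `max_codes` are hypothetical and do not appear in the original source. The sketch also uses a strict `<` bound so that codes stay in the range 0 to `max_codes - 1`, which differs from the `<=` test in the listing above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not in the original source): returns the LZW codes
// for `input`. `max_codes` caps the dictionary size; 4096 matches 12-bit codes.
std::vector<int> lzw_codes(const std::string& input, int max_codes = 4096) {
    assert(max_codes >= 256);  // must at least hold the single-byte entries

    // Seed the dictionary with every single-byte string
    std::unordered_map<std::string, int> dict;
    for (int i = 0; i < 256; ++i)
        dict[std::string(1, static_cast<char>(i))] = i;

    std::vector<int> out;
    std::string current;
    int next_code = 256;
    for (char c : input) {
        std::string candidate = current + c;
        if (dict.count(candidate)) {
            current = candidate;            // keep extending the match
        } else {
            out.push_back(dict[current]);   // emit the longest known prefix
            if (next_code < max_codes)      // strict bound keeps codes in range
                dict[candidate] = next_code++;
            current = std::string(1, c);    // restart from the current byte
        }
    }
    if (!current.empty())
        out.push_back(dict[current]);       // flush the final match
    return out;
}
```

For example, `lzw_codes("ABABABA")` yields the codes 65, 66, 256, 258: the two literals, then the learned entries for "AB" and "ABA".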

Compiled with the following settings (gcc version 5.2.0):

 g++ -c -O2 -g -pg -std=c++14 lzw.cpp

10 MB text file

 wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 10.txt
 real	0m4.302s
 user	0m3.072s
 sys	0m0.632s
 Flat profile:
 
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total           
  time   seconds   seconds    calls  ns/call  ns/call  name    
  45.83      0.55     0.55                             compress(string, int, string) 
  36.67      0.99     0.44 14983735    29.37    29.37  _M_find_before_node(unsigned int, string const&, unsigned int) const 
   7.50      1.08     0.09 10489603     8.58     8.58  show_usage() 
   5.83      1.15     0.07  4493878    15.58    44.94  operator[](string const&)  
   4.17      1.20     0.05                             _Z22convert_char_to_stringB5cxx11PKci  
   0.00      1.20     0.00     4097     0.00     0.00  _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*)  
   0.00      1.20     0.00     3841     0.00    29.37  _ZNSt10_HashtableINSt7  
   0.00      1.20     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i
   0.00      1.20     0.00        1     0.00     0.00  ~_Hashtable()

20 MB text file

 wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 20.txt
 real	0m8.924s
 user	0m6.504s
 sys	0m2.008s
 Flat profile:
 
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total           
  time   seconds   seconds    calls  ns/call  ns/call  name    
  49.33      1.47     1.47                             compress(string, int, string)
  34.56      2.50     1.03 29962271    34.38    34.38  _M_find_before_node(unsigned int, string const&, unsigned int) const
   7.05      2.71     0.21  8986654    23.37    57.74  operator[](string const&)
   6.71      2.91     0.20 20975363     9.53     9.53  show_usage()
   2.35      2.98     0.07                             _Z22convert_char_to_stringB5cxx11PKci
   0.00      2.98     0.00     4097     0.00     0.00  _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*)
   0.00      2.98     0.00     3841     0.00    34.38  _ZNSt10_HashtableINSt7
   0.00      2.98     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i
   0.00      2.98     0.00        1     0.00     0.00  ~_Hashtable()

30 MB text file

 wlee64@matrix:~/gpu610/assignments/a1> time lzw -c 30.txt
 real	0m13.637s
 user	0m9.665s
 sys	0m2.984s
 Flat profile:
 
 Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total           
  time   seconds   seconds    calls  ns/call  ns/call  name    
  45.59      1.86     1.86                             compress(string, int, string)
  37.25      3.38     1.52 44940806    33.82    33.82  _M_find_before_node(unsigned int, string const&, unsigned int) const
   7.60      3.69     0.31 13479429    23.00    56.82  operator[](string const&)
   6.62      3.96     0.27 31461123     8.58     8.58  show_usage()
   2.94      4.08     0.12                             _Z22convert_char_to_stringB5cxx11PKci
   0.00      4.08     0.00     4097     0.00     0.00  _M_insert_unique_node(unsigned int, unsigned int, std::__detail::_Hash_node<std::pair<string const, int>, true>*)
   0.00      4.08     0.00     3841     0.00    33.82  _ZNSt10_HashtableINSt7
   0.00      4.08     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z18convert_int_to_binB5cxx11i
   0.00      4.08     0.00        1     0.00     0.00  ~_Hashtable()

Profiling: Ray-tracing algorithm

Source Code: https://github.com/ksanghun/CUDA_raytrace/blob/master/GPUAssaginemt/cputest.cpp

Profiling Raytrace.png


Ray-Tracing Algorithm

Rt 1.png

Ray-sphere Intersection

Rt 2.png
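
The figure is not reproduced here, so the following is only a generic sketch of the common geometric ray-sphere test, not necessarily the exact code in cputest.cpp. The names `Vec3`, `sub`, `dot`, and `intersect_sphere` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Geometric ray-sphere test (illustrative; names are hypothetical).
// `dir` must be unit length. On a hit, t0 <= t1 are the distances along the
// ray at which it enters and leaves the sphere.
// Limitation: a sphere whose center lies behind the ray origin is reported
// as a miss, even if the origin is inside the sphere.
bool intersect_sphere(Vec3 orig, Vec3 dir, Vec3 center, double radius,
                      double& t0, double& t1) {
    assert(std::fabs(dot(dir, dir) - 1.0) < 1e-6);

    Vec3 l = sub(center, orig);          // origin -> sphere center
    double tca = dot(l, dir);            // projection of l onto the ray
    if (tca < 0) return false;           // center is behind the origin

    double d2 = dot(l, l) - tca * tca;   // squared distance from center to ray
    double r2 = radius * radius;
    if (d2 > r2) return false;           // ray passes outside the sphere

    double thc = std::sqrt(r2 - d2);     // half-length of the chord
    t0 = tca - thc;
    t1 = tca + thc;
    return true;
}
```

A ray from the origin along -z hits a unit sphere centered at (0, 0, -5) at distances 4 and 6.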

Trace

Rt 3.png

Floating-Point Considerations

Raytrace floatingerror.PNG


Assignment 2

1. Parallelize

- render()


Render CvsP2.png


- main()


Main CvsP2.png
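
The screenshots are not reproduced here. As a rough illustration of why render() parallelizes well, the sketch below computes the primary ray for one pixel from a flat index alone, so no iteration depends on another. In a CUDA kernel that index would typically come from `blockIdx.x * blockDim.x + threadIdx.x`. The names `Dir`, `primary_ray`, and `render_serial`, and the pinhole-camera formula, are assumptions rather than the team's actual code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Dir { float x, y, z; };

// Hypothetical per-pixel function: maps a flat pixel index to a unit-length
// primary ray direction for a pinhole camera looking down -z.
Dir primary_ray(int idx, int width, int height, float fov_deg) {
    assert(idx >= 0 && idx < width * height);
    int x = idx % width;                       // pixel column
    int y = idx / width;                       // pixel row
    float aspect = float(width) / float(height);
    float angle = std::tan(3.14159265f * 0.5f * fov_deg / 180.0f);
    // Map the pixel center to camera space: x in [-1, 1] * aspect, y flipped
    float xx = (2.0f * ((x + 0.5f) / width) - 1.0f) * angle * aspect;
    float yy = (1.0f - 2.0f * ((y + 0.5f) / height)) * angle;
    float inv = 1.0f / std::sqrt(xx * xx + yy * yy + 1.0f);
    return {xx * inv, yy * inv, -inv};
}

// Serial reference loop: every iteration is independent, which is what
// allows one GPU thread per pixel.
std::vector<Dir> render_serial(int width, int height, float fov_deg) {
    std::vector<Dir> rays(width * height);
    for (int idx = 0; idx < width * height; ++idx)
        rays[idx] = primary_ray(idx, width, height, fov_deg);
    return rays;
}
```

Replacing the loop in `render_serial` with a kernel launch of `width * height` threads is the usual first step when porting this kind of renderer.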

2. Performance

Data CvsP.PNG


Graph CvsP.PNG


Assignment 3

1. Optimize

- Global to constant memory


PvsO2.png

2. Performance

Data PvsO.PNG


Graph PvsO.PNG


3. GPU Occupancy

Rt 5.png
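
For readers unfamiliar with the term, theoretical occupancy is the ratio of warps that can be resident on a streaming multiprocessor (SM) to the maximum the SM supports. The sketch below does that arithmetic while considering only the warp and block limits; register and shared-memory limits are ignored. The function name and the limit values used in the example are assumptions, not figures taken from the screenshot.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: theoretical occupancy from the warp and block limits
// of one SM. Register and shared-memory limits are NOT modeled here.
double theoretical_occupancy(int threads_per_block, int max_warps_per_sm,
                             int max_blocks_per_sm, int warp_size = 32) {
    assert(threads_per_block > 0 && max_warps_per_sm > 0 &&
           max_blocks_per_sm > 0 && warp_size > 0);
    // A partially filled warp still occupies a full warp slot
    int warps_per_block = (threads_per_block + warp_size - 1) / warp_size;
    // Resident blocks are capped by both the warp budget and the block limit
    int blocks_by_warps = max_warps_per_sm / warps_per_block;
    int active_blocks = std::min(blocks_by_warps, max_blocks_per_sm);
    return double(active_blocks * warps_per_block) / max_warps_per_sm;
}
```

With illustrative limits of 64 warps and 32 blocks per SM, a block of 256 threads gives an occupancy of 1.0, while a block of 32 threads gives 0.5 because the block limit is reached first.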


Conclusion

1. Output

Video: https://youtu.be/3wV-ObHWZhg

2. Performance

Graph CvsPvsO.PNG