Carlos
#include <iostream>
}
---- Profiling Results for the summedAreaTable() function ------
250 1.50
500 25.87
=== Assignment 2 ===
Here is my code to parallelize the SAT (summed area table) algorithm I worked on in A1:
<pre>
#include <iostream>
}
---- Profiling Results for the summedAreaTable() function ------
Word Problem    A1/CPU (Seconds)    A2/GPU (Seconds)
100             0.03                0.0034
200             0.61                0.0445
300             3.08                0.2124
400             9.66                0.6549
500             24.17               1.58
600             54.4                3.268
700             113.17              5.976
--------------------------------------------------------------------
</pre>
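To make the comparison between the two columns easier to read, the speedup of A2 over A1 can be computed as the CPU time divided by the GPU time for each row. The sketch below is only an illustration: the `Timing` struct and the `profilingTable()` and `speedup()` helpers are hypothetical names that are not part of the assignment code, and the numbers are copied from the profiling table above.

```cpp
#include <vector>

// Hypothetical record for one row of the profiling table.
struct Timing {
    int problemSize;    // first column of the table
    double cpuSeconds;  // A1/CPU (Seconds)
    double gpuSeconds;  // A2/GPU (Seconds)
};

// Values copied verbatim from the profiling table above.
std::vector<Timing> profilingTable() {
    return {
        {100, 0.03, 0.0034},
        {200, 0.61, 0.0445},
        {300, 3.08, 0.2124},
        {400, 9.66, 0.6549},
        {500, 24.17, 1.58},
        {600, 54.4, 3.268},
        {700, 113.17, 5.976},
    };
}

// Speedup = CPU time / GPU time; values above 1 mean the GPU run was faster.
double speedup(const Timing& t) {
    return t.cpuSeconds / t.gpuSeconds;
}
```

Calling `speedup()` on each entry of `profilingTable()` gives the ratio per problem size. For the 500 row, for example, it is 24.17 / 1.58, roughly 15.3.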
=== Assignment 3 ===
Due to the difficulty of optimizing the A2 code that parallelizes the summed area table, the professor and I agreed that I would only provide an explanation of how I would optimize my code using a prefix sum method: http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html.
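As a rough illustration of the idea, a summed area table can be built from two passes of prefix sums: an inclusive scan along every row, followed by an inclusive scan along every column of the result. The sketch below is a sequential C++ version under stated assumptions, not the A2 code. The names `inclusiveScan` and `summedAreaTableByScans`, the `float` element type, and the row-major layout are all assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Inclusive prefix sum over `count` elements that are `stride` apart,
// starting at data[start]. Each element becomes the sum of itself and
// every earlier element in the same sequence.
void inclusiveScan(std::vector<float>& data, std::size_t start,
                   std::size_t count, std::size_t stride) {
    for (std::size_t i = 1; i < count; ++i) {
        data[start + i * stride] += data[start + (i - 1) * stride];
    }
}

// Builds the summed area table in place for a row-major rows x cols matrix.
void summedAreaTableByScans(std::vector<float>& table,
                            std::size_t rows, std::size_t cols) {
    // Pass 1: scan each row. Rows do not depend on each other.
    for (std::size_t r = 0; r < rows; ++r) {
        inclusiveScan(table, r * cols, cols, 1);
    }
    // Pass 2: scan each column of the row-scanned data.
    // Columns do not depend on each other either.
    for (std::size_t c = 0; c < cols; ++c) {
        inclusiveScan(table, c, rows, cols);
    }
}
```

For the 2 x 3 input `{1, 2, 3, 4, 5, 6}`, the row pass gives `{1, 3, 6, 4, 9, 15}` and the column pass gives `{1, 3, 6, 5, 12, 21}`. Since the scans within each pass are independent, on the GPU each one could be handed to a parallel scan such as the one described in the linked chapter, replacing the sequential loop in `inclusiveScan`.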