1
edit
Changes
→Assignment 2
}
=== Summary ===
For the rotation code, at around 500x600 dimensions or 300,000 pixels, the speed is about the same. However, as the image size increases, the scalar code will become much slower in comparison to the GPU code. To parallelize the code, I just used the straight forward tactic of unrolling the two for loops and assigning one thread for what would be each iteration. Since each index of the array was looked at individually, there is no problem doing that. As you can see in the benchmarks, through this process the speed was reduced greatly at higher resolutions.
== Assignment 3 ==