CDOT Wiki β

GPU610/DPS915 Team 7 Project Page

317 bytes added, 10:53, 29 March 2018
Assignment 3
Running the tests with the various block sizes showed that the optimal block size is 16 x 16 (timings below): the kernel time dropped from 179 ms to 106 ms. Observation: although block sizes of 16 x 16 and 32 x 32 both reach 100% thread occupancy, 16 x 16 turned out to be faster. This is probably because fewer warps are competing for access to global memory.
[[File:DPS915 Team7 Block size 16 x 16.PNG]]
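
The occupancy observation can be checked with a little host-side arithmetic. The following is a minimal sketch, not the team's code: the `SmLimits` struct, its field values (32 threads per warp, 2048 resident threads and 16 resident blocks per multiprocessor) and the helper names are all assumptions chosen for illustration, and the real limits depend on the GPU's compute capability. It also ignores register and shared-memory limits, which can lower occupancy in practice.

```cpp
#include <algorithm>
#include <cassert>

// Assumed per-multiprocessor limits (hypothetical values; look up the
// real ones for your GPU's compute capability).
struct SmLimits {
    int warpSize = 32;           // threads per warp
    int maxThreadsPerSm = 2048;  // resident threads per multiprocessor
    int maxBlocksPerSm = 16;     // resident blocks per multiprocessor
};

// Number of warps one block occupies, rounding partial warps up.
int warpsPerBlock(int blockDimX, int blockDimY, const SmLimits& sm) {
    int threads = blockDimX * blockDimY;
    assert(threads > 0);
    return (threads + sm.warpSize - 1) / sm.warpSize;
}

// Blocks that fit on one multiprocessor, considering only the thread
// and block limits (registers and shared memory are ignored here).
int residentBlocks(int blockDimX, int blockDimY, const SmLimits& sm) {
    int threadsRounded = warpsPerBlock(blockDimX, blockDimY, sm) * sm.warpSize;
    return std::min(sm.maxThreadsPerSm / threadsRounded, sm.maxBlocksPerSm);
}

// Theoretical occupancy in percent: resident warps / maximum warps.
int occupancyPercent(int blockDimX, int blockDimY, const SmLimits& sm) {
    int maxWarps = sm.maxThreadsPerSm / sm.warpSize;
    int resident = residentBlocks(blockDimX, blockDimY, sm) *
                   warpsPerBlock(blockDimX, blockDimY, sm);
    return 100 * resident / maxWarps;
}
```

Under these assumed limits, a 16 x 16 block holds 8 warps and a 32 x 32 block holds 32 warps, and both configurations fill the multiprocessor with the same total number of resident warps. Equal theoretical occupancy therefore does not by itself predict which block size is faster, which is why the timings had to be measured.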
 
 
The following image shows the final optimized kernel:
 
[[File:DPS915 Team7 RotatePixels2.PNG]]