=== Assignment 3 ===
'''The techniques that improved timings are discussed below:'''
Access to the pixel matrix in global memory was changed from column-major to row-major order, so that neighbouring threads read neighbouring addresses and the memory accesses are coalesced. This improved the timings as shown below: the kernel time dropped from 349 ms to 206 ms, a reduction of about 41%.
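The difference between the two access patterns can be sketched as below. This is only an illustration, not the assignment's actual kernel: the kernel names, the per-pixel operation (inverting an 8-bit value), the image size and the block shape are all assumptions made for the example. The only thing that changes between the two kernels is which matrix dimension <code>threadIdx.x</code> is mapped to.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical per-pixel operation (invert an 8-bit value); the real kernel
// from the assignment is not shown here.

// Column-major style access: threadIdx.x selects the ROW, so neighbouring
// threads in a warp touch addresses that are `width` bytes apart (uncoalesced).
__global__ void invertColumnMajor(unsigned char* img, int width, int height) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    int col = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < height && col < width)
        img[row * width + col] = 255 - img[row * width + col];
}

// Row-major style access: threadIdx.x selects the COLUMN, so neighbouring
// threads in a warp touch neighbouring addresses (coalesced).
__global__ void invertRowMajor(unsigned char* img, int width, int height) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row < height && col < width)
        img[row * width + col] = 255 - img[row * width + col];
}

int main() {
    const int width = 1024, height = 768;   // assumed image size
    unsigned char* d_img = nullptr;
    cudaMalloc(&d_img, width * height);
    cudaMemset(d_img, 0, width * height);

    dim3 block(32, 8);  // 32 threads along x, so one warp covers one x-run
    // Grid for the column-major kernel: x covers rows, y covers columns.
    dim3 gridCM((height + block.x - 1) / block.x, (width + block.y - 1) / block.y);
    // Grid for the row-major kernel: x covers columns, y covers rows.
    dim3 gridRM((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);

    invertColumnMajor<<<gridCM, block>>>(d_img, width, height);
    invertRowMajor<<<gridRM, block>>>(d_img, width, height);
    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    cudaFree(d_img);
    return 0;
}
```

Both kernels compute the same result; timing each launch separately (for example with <code>cudaEvent</code> timers or a profiler) should show the row-major version running faster, although the exact ratio depends on the GPU and the image size.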