GPU610/DPS915 Team 7 Project Page

No change in size, 10:56, 29 March 2018
Assignment 3
*Considered using constant memory for the source array. However, constant memory is limited to 65536 bytes (64 KB), so I was not able to allocate enough space to accommodate large images.
*There is one if-statement in the kernel ('''if (inBounds(r1, c1, maxRows, maxCols))''') that has the potential to cause thread divergence. However, this if-statement cannot be eliminated, because removing it would result in memory access exceptions whenever the computed destination falls outside the image.
*Tried pre-fetching by adding '''int srcVal = src[r * maxCols + c];''' before the other statements and changing '''dst[r1 * maxCols + c1] = src[r * maxCols + c];''' to '''dst[r1 * maxCols + c1] = srcVal;'''. However, this did not improve the timings.
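
The bounds check and the pre-fetch described above can be sketched as a small, self-contained program. This is only an illustration, not the project's actual kernel: the kernel name '''mapPixels''', the shift parameters '''dr''' and '''dc''', the '''int''' element type, and the extra guard on the source coordinates are all assumptions made for the example. The real mapping from (r, c) to (r1, c1) is not shown on this page, so a plain shift stands in for it. Error checking on the CUDA calls is left out for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// True when (r, c) lies inside a maxRows x maxCols image.
__device__ bool inBounds(int r, int c, int maxRows, int maxCols) {
    return r >= 0 && r < maxRows && c >= 0 && c < maxCols;
}

// One thread per source pixel. The shift (dr, dc) is a hypothetical stand-in
// for the project's real coordinate mapping, which is not shown on this page.
__global__ void mapPixels(const int* src, int* dst,
                          int maxRows, int maxCols, int dr, int dc) {
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    int r = blockIdx.y * blockDim.y + threadIdx.y;

    // Assumed extra guard: the grid may overhang the image edges.
    if (!inBounds(r, c, maxRows, maxCols)) return;

    // Pre-fetch: read the source value before computing the destination.
    int srcVal = src[r * maxCols + c];

    int r1 = r + dr;  // assumed mapping
    int c1 = c + dc;  // assumed mapping

    // Guard the write: an out-of-range (r1, c1) would index outside dst.
    if (inBounds(r1, c1, maxRows, maxCols))
        dst[r1 * maxCols + c1] = srcVal;
}

int main() {
    const int maxRows = 4, maxCols = 6, n = maxRows * maxCols;
    std::vector<int> h_src(n), h_dst(n, -1);  // -1 marks pixels never written
    for (int i = 0; i < n; ++i) h_src[i] = i;

    int *d_src = nullptr, *d_dst = nullptr;
    cudaMalloc(&d_src, n * sizeof(int));
    cudaMalloc(&d_dst, n * sizeof(int));
    cudaMemcpy(d_src, h_src.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_dst, h_dst.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    // Round the grid up so every pixel is covered by a thread.
    dim3 block(16, 16);
    dim3 grid((maxCols + block.x - 1) / block.x,
              (maxRows + block.y - 1) / block.y);
    mapPixels<<<grid, block>>>(d_src, d_dst, maxRows, maxCols, 1, 2);

    cudaMemcpy(h_dst.data(), d_dst, n * sizeof(int), cudaMemcpyDeviceToHost);

    // Print the destination image row by row.
    for (int r = 0; r < maxRows; ++r) {
        for (int c = 0; c < maxCols; ++c) printf("%3d", h_dst[r * maxCols + c]);
        printf("\n");
    }

    cudaFree(d_src);
    cudaFree(d_dst);
    return 0;
}
```

In this sketch only the threads whose destination falls off the image skip the write, so divergence is confined to warps that straddle an image edge. The compiler may already hoist the load of '''src''' ahead of the address arithmetic on its own, which would be consistent with the manual pre-fetch making no measurable difference.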