GPU610/DPS915 Team 7 Project Page

610 bytes added, 16:45, 31 March 2018
Assignment 3
*Using shared memory does not help since each element of '''dst''' is only assigned once and each element of '''src''' is only accessed once.
*Considered using constant memory for the source array. However, constant memory is limited to 65536 bytes (64 KB), so I was not able to allocate enough space to accommodate large images.
*There is one if-statement in the kernel ('''if (inBounds(r1, c1, maxRows, maxCols))''') that has the potential for thread divergence. However, removing it would cause a memory access exception for angles that are not multiples of 90 degrees, because some of the pixels would be rotated out of the original matrix. There are ways of avoiding this exception. For example, you could use a destination matrix that is large enough to hold the rotated image and then, when saving, only save the region within the boundaries of the original image, as illustrated in the figure below. Before trying this, I removed the if-statement and rotated by 90 degrees, and there was only a slight improvement in the timing. Therefore, it was not worth implementing such a method in order to remove the if-statement.[[File:DPS915 Team7 Pixel Matrix.png|center]]
*Tried pre-fetching by adding '''int srcVal = src[r * maxCols + c];''' before the other statements and changing the assignment ('''dst[r1 * maxCols + c1] = src[r * maxCols + c];''') to '''dst[r1 * maxCols + c1] = srcVal;'''. However, this did not result in any timing improvements.
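
The points above can be illustrated with a minimal host-side C++ sketch. It is not the team's kernel: each loop iteration stands in for one GPU thread, and the rotation formula, the rotation centre, the body of '''inBounds''' and the names '''fitsInConstantMemory''' and '''rotateWithBoundsCheck''' are assumptions made for illustration only. It shows the constant memory size check, the pre-fetched '''srcVal''', and what the bounds check guards against.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// CUDA constant memory limit quoted in the notes above, in bytes.
constexpr std::size_t kConstantMemoryBytes = 65536;

// Returns true if a rows x cols image of int pixels fits in constant memory.
inline bool fitsInConstantMemory(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(int) <= kConstantMemoryBytes;
}

// Assumed definition of the inBounds check named in the notes.
inline bool inBounds(int r, int c, int maxRows, int maxCols) {
    return r >= 0 && r < maxRows && c >= 0 && c < maxCols;
}

// Host-side reference: each loop iteration plays the role of one GPU thread.
// Returns the number of source pixels that rotate outside the matrix.
inline int rotateWithBoundsCheck(const std::vector<int>& src, std::vector<int>& dst,
                                 int maxRows, int maxCols, double angleRad) {
    const double cosA = std::cos(angleRad);
    const double sinA = std::sin(angleRad);
    const double r0 = (maxRows - 1) / 2.0;  // assumed rotation centre
    const double c0 = (maxCols - 1) / 2.0;
    int dropped = 0;
    for (int r = 0; r < maxRows; ++r) {
        for (int c = 0; c < maxCols; ++c) {
            // Pre-fetch: read the source pixel before the index arithmetic.
            const int srcVal = src[r * maxCols + c];
            const int r1 = static_cast<int>(
                std::lround(r0 + (r - r0) * cosA - (c - c0) * sinA));
            const int c1 = static_cast<int>(
                std::lround(c0 + (r - r0) * sinA + (c - c0) * cosA));
            if (inBounds(r1, c1, maxRows, maxCols)) {
                dst[r1 * maxCols + c1] = srcVal;
            } else {
                // Without the check, this write could fall outside dst
                // or land on the wrong pixel.
                ++dropped;
            }
        }
    }
    return dropped;
}
```

With these assumptions, a 1024 x 1024 image of '''int''' pixels is far larger than the constant memory limit, and rotating even a small 4 x 4 matrix by 45 degrees sends some pixels outside the matrix, which is the case the if-statement protects against.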