Three-Star

724 bytes removed, 15:13, 8 April 2018
=== Assignment 3 ===
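 __global__ void rotateKernel(int* oldImage, int* newImage, int rows, int cols, float rads) {
     // x indexes columns and y indexes rows, so threads adjacent in x
     // read adjacent elements of the same image row
     int c = blockIdx.x * blockDim.x + threadIdx.x;
     int r = blockIdx.y * blockDim.y + threadIdx.y;
     // rotate about the centre of the image
     int r0 = rows / 2;
     int c0 = cols / 2;
     float sinRads = sinf(rads);
     float cosRads = cosf(rads);
     // skip threads that fall outside the image
     if (r < rows && c < cols)
     {
         // destination coordinates of source pixel (r, c)
         int r1 = (int)(r0 + ((r - r0) * cosRads) - ((c - c0) * sinRads));
         int c1 = (int)(c0 + ((r - r0) * sinRads) + ((c - c0) * cosRads));
         // copy only if the rotated pixel lands inside the image
         if (inBounds(r1, c1, rows, cols))
         {
             newImage[r1 * cols + c1] = oldImage[r * cols + c];
         }
     }
 }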
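To check the kernel's output, a CPU reference that applies the same forward mapping can be useful. The sketch below is only an illustration: <code>rotateHost</code> is a hypothetical name, the <code>inBounds</code> helper is an assumed definition (the kernel calls it, but its body is not shown on this page), and zero-filling the output is also an assumption.

```cpp
#include <cmath>
#include <vector>

// Assumed definition: the kernel calls inBounds() but the page does not show it.
// This version simply checks 0 <= r < rows and 0 <= c < cols.
inline bool inBounds(int r, int c, int rows, int cols) {
    return r >= 0 && r < rows && c >= 0 && c < cols;
}

// Hypothetical CPU reference using the same arithmetic as rotateKernel.
// Pixels that no source pixel maps to keep the fill value 0 (an assumption).
std::vector<int> rotateHost(const std::vector<int>& oldImage, int rows, int cols, float rads) {
    std::vector<int> newImage(oldImage.size(), 0);
    int r0 = rows / 2;
    int c0 = cols / 2;
    float sinRads = std::sin(rads);
    float cosRads = std::cos(rads);
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            // destination coordinates of source pixel (r, c)
            int r1 = (int)(r0 + ((r - r0) * cosRads) - ((c - c0) * sinRads));
            int c1 = (int)(c0 + ((r - r0) * sinRads) + ((c - c0) * cosRads));
            if (inBounds(r1, c1, rows, cols)) {
                newImage[r1 * cols + c1] = oldImage[r * cols + c];
            }
        }
    }
    return newImage;
}
```

A rotation by 0 radians should return the input unchanged, which makes a quick sanity check. When comparing against the GPU result, a few pixels may differ because device and host sine/cosine can round differently near pixel boundaries.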
Using Coalesced Memory (changed matrix access from column to row)
|-
|512x512
| 0.54ms || 0.90ms || 85.72ms || 0.51ms || 0.89ms || 95.59ms
|-
|2x enlarged
| 1.80ms || 3.55ms || 99.66ms || 1.76ms || 3.54ms || 103.11ms
|-
|3x enlarged
| 4.65ms || 7.97ms || 111.79ms || 4.69ms || 7.95ms || 114.52ms
|-
|4x enlarged
| 8.22ms || 14.15ms || 134.32ms || 7.90ms || 14.13ms || 114.33ms
|-
|5x enlarged
| 12.89ms || 22.15ms || 128.59ms || 12.70ms || 22.09ms || 144.42ms
|-
|}
Changing the memory access pattern from column order to row order does not appear to have made any significant difference to the measured times.
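One way to look at the access pattern is to compute how far apart the addresses of two x-adjacent threads are. The sketch below is host-side C++ with hypothetical helper names (<code>readIndex</code>, <code>writeIndex</code>, <code>readStride</code>, <code>writeStride</code>) that reuse the kernel's index arithmetic. It suggests that reads from <code>oldImage</code> have stride 1, while writes to <code>newImage</code> are generally scattered for non-zero angles. Whether this explains the timings above is an assumption that has not been measured here.

```cpp
#include <cmath>
#include <cstdlib>

// Flat index the kernel reads for source pixel (r, c).
inline int readIndex(int r, int c, int cols) {
    return r * cols + c;
}

// Flat index the kernel writes for source pixel (r, c), using the same
// rotation arithmetic as rotateKernel. Bounds checking is left to the caller.
inline int writeIndex(int r, int c, int rows, int cols, float rads) {
    int r0 = rows / 2;
    int c0 = cols / 2;
    float sinRads = std::sin(rads);
    float cosRads = std::cos(rads);
    int r1 = (int)(r0 + ((r - r0) * cosRads) - ((c - c0) * sinRads));
    int c1 = (int)(c0 + ((r - r0) * sinRads) + ((c - c0) * cosRads));
    return r1 * cols + c1;
}

// Distance in elements between the addresses read by two threads that are
// neighbours in x (columns c and c + 1 of the same row r).
inline int readStride(int r, int c, int cols) {
    return readIndex(r, c + 1, cols) - readIndex(r, c, cols);
}

// Same distance for the addresses written after rotation by rads.
inline int writeStride(int r, int c, int rows, int cols, float rads) {
    return writeIndex(r, c + 1, rows, cols, rads) - writeIndex(r, c, rows, cols, rads);
}
```

For a 512x512 image, <code>readStride</code> is 1 everywhere. <code>writeStride</code> at the image centre is 1 for a rotation of 0 radians, and roughly one full row (about 512 elements in magnitude) for a quarter turn.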