Changes

TudyBert

2,907 bytes added, 13:21, 19 April 2013

→‎Drive_God

Here's the final, optimized enlarge method:

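// Fragment of the enlarge kernel body; work, result and factor are assumed to be kernel parameters.
int jdx = blockIdx.x * blockDim.x + threadIdx.x;
int idx = blockIdx.y * blockDim.y + threadIdx.y;
int k = idx + jdx * blockDim.x * gridDim.x;

// Each thread keeps its own source pixel in a register. A single __shared__ int
// would be overwritten by every thread in the block, which is a data race.
int pixel = work[k];
int enlargeRow = idx * factor;
int enlargeCol = jdx * factor;

// Copy the pixel into its factor x factor block of the enlarged image.
// No __syncthreads() is needed because threads do not share any data here.
for (int c = enlargeRow; c < enlargeRow + factor; c++)
{
    for (int d = enlargeCol; d < enlargeCol + factor; d++)
    {
        result[c + d * blockDim.x * gridDim.x * factor] = pixel;
    }
}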
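A host-side reference can help check the output of the enlarge code. The sketch below is not part of the original project: the function name `enlargeReference` and the parameter `n` are hypothetical, and it assumes a square n x n image that the launch grid covers exactly, so that the stride `blockDim.x * gridDim.x` equals `n`. It repeats the same index arithmetic on the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for the enlarge code (an assumption, not the
// author's code). The source pixel (idx, jdx) lives at work[idx + jdx * n]
// and is replicated into a factor x factor block of the result, whose
// stride is n * factor.
std::vector<int> enlargeReference(const std::vector<int>& work, int n, int factor)
{
    assert(n > 0 && factor > 0);
    assert(work.size() == static_cast<std::size_t>(n) * n);

    const int outStride = n * factor;
    std::vector<int> result(static_cast<std::size_t>(outStride) * outStride);

    for (int jdx = 0; jdx < n; ++jdx) {
        for (int idx = 0; idx < n; ++idx) {
            const int pixel = work[idx + jdx * n];
            const int enlargeRow = idx * factor;
            const int enlargeCol = jdx * factor;
            for (int c = enlargeRow; c < enlargeRow + factor; ++c) {
                for (int d = enlargeCol; d < enlargeCol + factor; ++d) {
                    result[c + d * outStride] = pixel;
                }
            }
        }
    }
    return result;
}
```

Comparing this vector element by element with the buffer copied back from the device is one way to catch indexing mistakes, under the square-image assumption stated above.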

I enjoyed parallelizing this program and really wish I could have figured out the CERN project. To make myself feel better I also parallelized the rotate image method.

Here's the final optimized code for rotating an image around its centre:

<div>

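__global__ void cudaRotateImage(int *result, const int *work, int ni, int nj, float rads)
{
    // Assumes the launch grid covers the image exactly, since k uses the grid width as its stride.
    int jdx = blockIdx.x * blockDim.x + threadIdx.x;
    int idx = blockIdx.y * blockDim.y + threadIdx.y;
    int k = idx + jdx * blockDim.x * gridDim.x;

    // Centre of rotation.
    int r0 = ni / 2;
    int c0 = nj / 2;

    // Evaluate the trigonometric terms once per thread, in single precision.
    float cosT = cosf(rads);
    float sinT = sinf(rads);

    // Rotate (idx, jdx) about the centre; the cast truncates toward zero.
    int r1 = (int) (r0 + (idx - r0) * cosT - (jdx - c0) * sinT);
    int c1 = (int) (c0 + (idx - r0) * sinT + (jdx - c0) * cosT);

    // Write only when the rotated position falls inside the image.
    if (r1 >= 0 && r1 < ni && c1 >= 0 && c1 < nj)
    {
        result[c1 * nj + r1] = work[k];
    }
}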

</div>
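The rotation can be checked the same way with a CPU version of the mapping. This is a sketch under assumptions, not the original project code: `rotateReference`, `n` and `fill` are hypothetical names, the image is taken to be square (ni == nj == n), and the grid is assumed to cover it exactly so that a source pixel sits at `work[idx + jdx * n]`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for the rotation kernel (an assumption, not the
// author's code). Destination pixels that no source pixel maps to keep the
// value `fill`.
std::vector<int> rotateReference(const std::vector<int>& work, int n,
                                 float rads, int fill = 0)
{
    assert(n > 0);
    assert(work.size() == static_cast<std::size_t>(n) * n);

    std::vector<int> result(work.size(), fill);
    const int r0 = n / 2;  // centre of rotation
    const int c0 = n / 2;
    const float cosT = std::cos(rads);
    const float sinT = std::sin(rads);

    for (int jdx = 0; jdx < n; ++jdx) {
        for (int idx = 0; idx < n; ++idx) {
            // Same forward mapping as the kernel; the cast truncates toward zero.
            const int r1 = static_cast<int>(r0 + (idx - r0) * cosT - (jdx - c0) * sinT);
            const int c1 = static_cast<int>(c0 + (idx - r0) * sinT + (jdx - c0) * cosT);
            if (r1 >= 0 && r1 < n && c1 >= 0 && c1 < n) {
                result[c1 * n + r1] = work[idx + jdx * n];
            }
        }
    }
    return result;
}
```

Because this is a forward mapping with truncation, some destination pixels may never be written for angles that are not multiples of 90 degrees, and CPU and GPU rounding can differ by a pixel at the boundaries. An exact match is only expected for simple cases such as rads = 0.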

Rwstanica

CDOT Wiki β
