Changes


TudyBert

2,907 bytes added, 13:21, 19 April 2013
Drive_God
Here's the final, optimized enlarge method:
 
<span style='color:#7f0055; font-weight:bold; '>int</span> jdx = blockIdx.x * blockDim.x + threadIdx.x;
 
<span style='color:#7f0055; font-weight:bold; '>int</span> idx = blockIdx.y * blockDim.y + threadIdx.y;
 
<span style='color:#7f0055; font-weight:bold; '>int</span> k = idx + jdx * blockDim.x * gridDim.x;
 
<span style='color:#7f0055; font-weight:bold; '>int</span> enlargeRow, enlargeCol;
 
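// Per-thread copy of the source pixel. A single __shared__ int would be
// overwritten by every thread in the block, so a register is used instead.
<span style='color:#7f0055; font-weight:bold; '>int</span> pixel = work[k];
 
enlargeRow = idx * factor;
 
enlargeCol = jdx * factor;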
 
<span style='color:#7f0055; font-weight:bold; '>for</span>(<span style='color:#7f0055; font-weight:bold; '>int</span> c = enlargeRow; c &lt; (enlargeRow + factor); c++)
 
{
 
<span style='color:#7f0055; font-weight:bold; '>for</span>(<span style='color:#7f0055; font-weight:bold; '>int</span> d = enlargeCol; d &lt; (enlargeCol + factor); d++)
 
{
 
result[c + d * blockDim.x * gridDim.x * factor] = pixel;
 
}
 
}
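
One way to check the kernel's output is to compare it against a plain host-side loop that uses the same index arithmetic. The sketch below is not part of the original project: `enlargeReference` is a hypothetical helper name, and it assumes a square image whose side `n` equals `blockDim.x * gridDim.x`, so that `k = idx + jdx * n` addresses every source pixel exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for the enlarge kernel.
// Assumption: the image is square with side n (= blockDim.x * gridDim.x).
std::vector<int> enlargeReference(const std::vector<int>& work, int n, int factor)
{
    const int outSide = n * factor;  // side length of the enlarged image
    std::vector<int> result(static_cast<std::size_t>(outSide) * outSide, 0);

    for (int jdx = 0; jdx < n; ++jdx) {
        for (int idx = 0; idx < n; ++idx) {
            const int pixel = work[idx + jdx * n];  // same k as in the kernel

            // Copy the source pixel into its factor x factor block.
            for (int c = idx * factor; c < (idx + 1) * factor; ++c) {
                for (int d = jdx * factor; d < (jdx + 1) * factor; ++d) {
                    result[c + d * outSide] = pixel;
                }
            }
        }
    }
    return result;
}
```

Copying the device result back to the host and comparing it element by element with this vector should show quickly whether each thread wrote its own pixel.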
 
I enjoyed parallelizing this program, though I really wish I could have figured out the CERN project. To make myself feel better, I also parallelized the rotate-image method.
 
Here's the final, optimized code for rotating an image around its centre:
 
 
<div>
__global__ <span style='color:#7f0055; font-weight:bold; '>void</span> cudaRotateImage(<span style='color:#7f0055; font-weight:bold; '>int</span> *result, <span style='color:#7f0055; font-weight:bold; '>const</span> <span style='color:#7f0055; font-weight:bold; '>int</span> *work, <span style='color:#7f0055; font-weight:bold; '>int</span> ni, <span style='color:#7f0055; font-weight:bold; '>int</span> nj, <span style='color:#7f0055; font-weight:bold; '>float</span> rads)
 
{
 
<span style='color:#7f0055; font-weight:bold; '>int</span> r0, c0;
 
<span style='color:#7f0055; font-weight:bold; '>int</span> r1, c1;
 
<span style='color:#7f0055; font-weight:bold; '>int</span> jdx = blockIdx.x * blockDim.x + threadIdx.x;
 
<span style='color:#7f0055; font-weight:bold; '>int</span> idx = blockIdx.y * blockDim.y + threadIdx.y;
 
<span style='color:#7f0055; font-weight:bold; '>int</span> k = idx + jdx * blockDim.x * gridDim.x;
 
r0 = ni / 2;
 
c0 = nj / 2;
 
 
 
r1 = (<span style='color:#7f0055; font-weight:bold; '>int</span>) (r0 + ((idx - r0) * <span style='color:#7f0055; font-weight:bold; '>cos</span>(rads)) - ((jdx - c0) * <span style='color:#7f0055; font-weight:bold; '>sin</span>(rads)));
 
c1 = (<span style='color:#7f0055; font-weight:bold; '>int</span>) (c0 + ((idx - r0) * <span style='color:#7f0055; font-weight:bold; '>sin</span>(rads)) + ((jdx - c0) * <span style='color:#7f0055; font-weight:bold; '>cos</span>(rads)));
 
 
<span style='color:#7f0055; font-weight:bold; '>if</span>(!(r1 >= ni || r1 &lt; 0 || c1 >= nj || c1 &lt; 0))
 
{
 
result[c1 * nj + r1] = work[k];
 
}
 
 
}
</div>
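
The rotation mapping can be checked on the host in the same way. The sketch below is an illustration only: `rotateReference` is a hypothetical helper, it assumes a square image of side `n` (so `ni == nj == n` and the source index is `idx + jdx * n`), and it fills destination pixels that no source pixel maps to with 0, which the kernel itself does not specify.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for the rotation kernel.
// Assumption: square image of side n, so ni == nj == n.
std::vector<int> rotateReference(const std::vector<int>& work, int n, float rads)
{
    std::vector<int> result(static_cast<std::size_t>(n) * n, 0);  // 0 = unmapped (assumption)
    const int r0 = n / 2;  // centre of rotation
    const int c0 = n / 2;
    const float cosA = std::cos(rads);
    const float sinA = std::sin(rads);

    for (int jdx = 0; jdx < n; ++jdx) {
        for (int idx = 0; idx < n; ++idx) {
            // Same forward mapping as the kernel: rotate (idx, jdx) about (r0, c0).
            const int r1 = static_cast<int>(r0 + (idx - r0) * cosA - (jdx - c0) * sinA);
            const int c1 = static_cast<int>(c0 + (idx - r0) * sinA + (jdx - c0) * cosA);

            // Drop pixels that land outside the image.
            if (r1 >= 0 && r1 < n && c1 >= 0 && c1 < n) {
                result[c1 * n + r1] = work[idx + jdx * n];
            }
        }
    }
    return result;
}
```

With `rads = 0` the mapping is the identity, which makes a convenient first sanity check before comparing other angles against the device output.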