


3,549 bytes added, 14:29, 19 April 2013
Assignment 3
<pre style='color:#000000;background:#ffffff;'>
void Image::operator=(const Image&amp; oldImage)
/*copies oldImage into whatever you = it to*/
{
    N = oldImage.N;
    M = oldImage.M;
    Q = oldImage.Q;
    if(dim1 != NULL)
    {
        delete[] dim1;
    }
    pixelVal = new int* [N];
    dim1 = new int[N*M];
    for(int i = 0; i &lt; N; i++)
    {
        pixelVal[i] = new int [M];
        for(int j = 0; j &lt; M; j++)
        {
            pixelVal[i][j] = oldImage.pixelVal[i][j];
            // row stride is M, the row length, so every one of the N*M elements is copied
            dim1[i*M + j] = oldImage.dim1[i*M + j];
        }
    }
}
</pre>
The bulk of the processing time is wasted on copying the two arrays over from one image to another. If I have time I might look into parallelizing this as well. It would be interesting to see if the speed of the GPU can overcome the overhead of copying to and from the device.
=== Assignment 2 ===
For Assignment 2 I simply put the four for loops into a kernel and replaced the outermost loop with thread indices. I made a helper method that set up memory on the device and launched the kernel with a 1-dimensional array of blocks, each containing 1 thread. I launched as many blocks of 1 thread as there were rows in the image file. I figured this was the quickest way to get this method parallelized. Unfortunately I hit a wall with my data sizes. The CPU version of the enlarge image method fails when run for more than 50 loops. The error thrown is a Visual Studio debugging error, so I think VS isn't too happy with having the CPU hogged for so long. As a result I've had to extrapolate times for larger loops by assuming a linear increase in time taken.
Here's the code for the newly parallelized method:
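int idx = blockIdx.x * blockDim.x + threadIdx.x; // one thread per image row
int enlargeRow, enlargeCol;
__shared__ int pixel; // safe only because each block holds a single thread
for(int j = 0; j < nj; j++)
{
    pixel = work[idx * nj + j];
    enlargeRow = idx * factor;
    enlargeCol = j * factor;
    // copy the source pixel into a factor x factor square of the result
    for(int c = enlargeRow; c < (enlargeRow + factor); c++)
    {
        for(int d = enlargeCol; d < (enlargeCol + factor); d++)
        {
            // the row stride is thread count * factor, which equals the output width only for square images
            result[d + c * blockDim.x * gridDim.x * factor] = pixel;
        }
    }
}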
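For checking the kernel's output, the four loops can also be written as a plain CPU reference. The following is only a minimal sketch: the function name, the parameters and the flat row-major layout are assumptions for illustration, not the original enlarge method. It uses the output width as the row stride, so it should also handle non-square images.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference for the enlarge step: every source pixel becomes a
// factor x factor square in the output. All names are hypothetical.
// `work` holds rows x cols pixels in row-major order.
std::vector<int> enlargeReference(const std::vector<int>& work,
                                  int rows, int cols, int factor)
{
    assert(factor > 0);
    assert(static_cast<int>(work.size()) == rows * cols);

    const int outCols = cols * factor;  // width of the enlarged image
    std::vector<int> result(static_cast<std::size_t>(rows) * factor * outCols);

    for (int i = 0; i < rows; i++) {        // the loop the kernel replaces with a thread index
        for (int j = 0; j < cols; j++) {
            const int pixel = work[i * cols + j];
            for (int c = i * factor; c < (i + 1) * factor; c++) {
                for (int d = j * factor; d < (j + 1) * factor; d++) {
                    result[c * outCols + d] = pixel;  // row stride is the output width
                }
            }
        }
    }
    return result;
}
```

Comparing this result element by element against the array copied back from the device is one way to confirm the kernel's index math.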
While I did see a decrease in the time taken to run 50 loops, the decrease wasn't as significant as I had hoped. Obviously this kernel isn't optimized, so I'm looking forward to some more impressive results as I update the code.
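The linear extrapolation used for the larger loop counts amounts to scaling one measured time by the ratio of loop counts. A minimal sketch, assuming a single measurement and hypothetical names (no real timings are implied):

```cpp
#include <cassert>

// Scales a measured run time to a larger loop count, assuming the time
// grows linearly with the number of loops. Names are hypothetical.
double extrapolateMs(double measuredMs, int measuredLoops, int targetLoops)
{
    assert(measuredLoops > 0);
    return measuredMs * static_cast<double>(targetLoops) / measuredLoops;
}
```

This ignores any fixed start-up cost, so it is only as good as the linearity assumption.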
=== Assignment 3 ===
After making sure memory access is coalesced and replacing the second counter loop with threads from a 2-dimensional grid of 2-dimensional blocks, I've achieved significant speedups in the program. All it took was launching the kernel with an optimized 2D array of blocks, each containing a 2D array of threads. For Assignment 2 I had a grid with 1 thread for each row in the image. That meant each thread was running 3 nested for loops to do the necessary calculations for enlarging. Figuring out the math for calculating the correct index in the arrays proved to be tricky. Although I knew exactly what to do in concept, the two extra nested for loops threw me off. For a long time the image was being enlarged correctly but the physical dimensions of the image weren't increasing. Once I had that figured out, the image was enlarging but not to the new dimensions. After some tracing and trial and error I managed to find the right formula to calculate the indices.
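The index arithmetic for a 2D launch can be traced on the CPU before touching the device. The sketch below emulates one thread per source pixel. Dim2 and every other name are hypothetical stand-ins for CUDA's built-in variables, and the row-major mapping is an assumption, not necessarily the exact formula used in the kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUDA's two-dimensional gridDim / blockDim.
struct Dim2 { int x, y; };

// CPU emulation of a 2D launch: one emulated thread per source pixel.
std::vector<int> enlargeEmulated(const std::vector<int>& work,
                                 Dim2 gridDim, Dim2 blockDim, int factor)
{
    const int cols = gridDim.x * blockDim.x;  // threads along x = image width
    const int rows = gridDim.y * blockDim.y;  // threads along y = image height
    assert(factor > 0);
    assert(static_cast<int>(work.size()) == rows * cols);

    const int outCols = cols * factor;
    // -1 marks output cells that no emulated thread has written yet
    std::vector<int> result(static_cast<std::size_t>(rows) * factor * outCols, -1);

    for (int by = 0; by < gridDim.y; by++)
    for (int bx = 0; bx < gridDim.x; bx++)
    for (int ty = 0; ty < blockDim.y; ty++)
    for (int tx = 0; tx < blockDim.x; tx++) {
        const int col = bx * blockDim.x + tx;  // global x index of this thread
        const int row = by * blockDim.y + ty;  // global y index of this thread
        const int pixel = work[row * cols + col];
        for (int c = row * factor; c < (row + 1) * factor; c++)
            for (int d = col * factor; d < (col + 1) * factor; d++)
                result[c * outCols + d] = pixel;
    }
    return result;
}
```

If any cell is still -1 afterwards, the index formula or the launch dimensions do not cover the whole enlarged image.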
Here's the final, optimized enlarge method:
int jdx = blockIdx.x * blockDim.x + threadIdx.x;
int idx = blockIdx.y * blockDim.y + threadIdx.y;
int k = idx + jdx * blockDim.x * gridDim.x;
int enlargeRow, enlargeCol;
int pixel; // per-thread copy: a __shared__ scalar would be overwritten by every thread in the block
pixel = work[k];
enlargeRow = idx * factor;
enlargeCol = jdx * factor;
for(int c = enlargeRow; c < (enlargeRow + factor); c++)
{
    for(int d = enlargeCol; d < (enlargeCol + factor); d++)
    {
        // no __syncthreads() needed: threads no longer share any data
        result[c + d * blockDim.x * gridDim.x * factor] = pixel;
    }
}
I enjoyed parallelizing this program and really wish I could have figured out the CERN project. To make myself feel better I also parallelized the rotate image method.
I was going to paste the code snippet here but I'm getting frustrated with the formatting. Why is it so difficult to nicely format code on a Wiki? [http:// Here] it is.
