Changes

TudyBert

2,199 bytes added, 14:29, 19 April 2013

=== Assignment 3 ===

After making sure memory access is coalesced and replacing the second counter loop with threads from a 2D grid of 2D thread blocks, I've achieved significant speedups in the program. All it took was launching the kernel with a 2D array of blocks, each containing a 2D array of threads. For assignment 2 I had a grid with one thread for each column in the image, which meant each thread was running three nested for loops to do the calculations needed for enlarging.

Figuring out the math for calculating the correct index in the arrays proved to be tricky. Although I knew exactly what to do in concept, the two extra nested for loops threw me off. For a long time the image content was being enlarged correctly but the physical dimensions of the image weren't increasing. Once I had that figured out, the image was enlarging but not to the new dimensions. After some tracing and trial and error I managed to find the right formula to calculate the indices.

Here's the final, optimized enlarge method:

<pre>
// Global 2D coordinates of this thread
int jdx = blockIdx.x * blockDim.x + threadIdx.x;
int idx = blockIdx.y * blockDim.y + threadIdx.y;

// Linear index of the source pixel; the pitch is the total thread count in x
int k = idx + jdx * blockDim.x * gridDim.x;

// Each thread keeps its own pixel in a register. A __shared__ variable here
// would be a single slot written by every thread in the block.
int pixel = work[k];

// Top-left corner of the factor x factor output tile for this pixel
int enlargeRow = idx * factor;
int enlargeCol = jdx * factor;

// Threads write disjoint tiles, so no __syncthreads() is needed
for (int c = enlargeRow; c < enlargeRow + factor; c++) {
    for (int d = enlargeCol; d < enlargeCol + factor; d++) {
        result[c + d * blockDim.x * gridDim.x * factor] = pixel;
    }
}
</pre>

I enjoyed parallelizing this program and really wish I could have figured out the CERN project. To make myself feel better I also parallelized the rotate image method. I was going to paste the code snippet here but I'm getting frustrated with the formatting. Why is it so difficult to nicely format code on a wiki? [http://pastebin.com/ZZV9KRJN Here] it is.
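Since getting the indices right was the hard part, here is a rough host-side sketch of the same index arithmetic in plain C++. It can be run without a GPU to check a small image by hand. This is not the kernel itself: the function name <code>enlarge_reference</code> and the parameters <code>n_i</code> and <code>n_j</code> are hypothetical. It assumes source element (i, j) is stored at <code>i + j * n_i</code>, with i playing the role of <code>idx</code> and j the role of <code>jdx</code>. That matches the kernel's layout only when the total number of threads in x equals the extent of <code>idx</code>.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for the enlarge index math (an
// illustration, not the original kernel).
// Assumption: source element (i, j) is stored at i + j * n_i, where
// i corresponds to idx and j corresponds to jdx in the kernel.
std::vector<int> enlarge_reference(const std::vector<int>& work,
                                   int n_i, int n_j, int factor) {
    assert(n_i > 0 && n_j > 0 && factor > 0);
    assert(work.size() == static_cast<std::size_t>(n_i) * n_j);

    const int out_i = n_i * factor;  // enlarged extent along the fast axis
    std::vector<int> result(static_cast<std::size_t>(out_i) * n_j * factor);

    // The two outer loops stand in for the 2D grid of threads.
    for (int j = 0; j < n_j; ++j) {
        for (int i = 0; i < n_i; ++i) {
            const int pixel = work[i + j * n_i];  // one source pixel per "thread"
            const int enlargeRow = i * factor;
            const int enlargeCol = j * factor;

            // Copy the pixel into its factor x factor output tile.
            for (int c = enlargeRow; c < enlargeRow + factor; ++c) {
                for (int d = enlargeCol; d < enlargeCol + factor; ++d) {
                    result[c + d * out_i] = pixel;
                }
            }
        }
    }
    return result;
}
```

Comparing the output of a sketch like this against the GPU result for a tiny input (for example 2 x 2 with a factor of 2) is one way to catch the "enlarged but wrong dimensions" kind of bug early, because the output pitch <code>out_i</code> is explicit rather than derived from the launch configuration.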

Rwstanica
