

GPUSquad

649 bytes added, 13:01, 7 April 2018
=== Assignment 2 ===
 
We parallelized the original code by moving the Jacobi calculations into a kernel. This initial parallel version uses only 1D threading: each thread owns one index along one dimension and runs a for loop over the other dimension.
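As a rough illustration of this 1D layout, here is a minimal sketch. The 4-point averaging stencil, the grid size, the boundary values, and the names `jacobi1D`, `nx`, `ny` and `ntpb` are assumptions made for the example and are not taken from the project code. Error checking is omitted for brevity.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumed 4-point averaging stencil; the project's actual update may differ.
__global__ void jacobi1D(const float* in, float* out, int nx, int ny) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // one column per thread
    if (i < 1 || i >= nx - 1) return;                // boundary columns are left untouched
    for (int j = 1; j < ny - 1; ++j) {               // serial loop over the rows
        out[j * nx + i] = 0.25f * (in[j * nx + i - 1] + in[j * nx + i + 1] +
                                   in[(j - 1) * nx + i] + in[(j + 1) * nx + i]);
    }
}

int main() {
    const int nx = 100, ny = 80, ntpb = 32;          // hypothetical sizes
    const size_t bytes = size_t(nx) * ny * sizeof(float);
    std::vector<float> h_in(nx * ny, 0.0f), h_out(nx * ny, 0.0f);
    for (int i = 0; i < nx; ++i) h_in[i] = 1.0f;     // assumed boundary condition: hot top edge

    // CPU reference for one sweep, used only to check the kernel.
    std::vector<float> h_ref = h_in;
    for (int j = 1; j < ny - 1; ++j)
        for (int i = 1; i < nx - 1; ++i)
            h_ref[j * nx + i] = 0.25f * (h_in[j * nx + i - 1] + h_in[j * nx + i + 1] +
                                         h_in[(j - 1) * nx + i] + h_in[(j + 1) * nx + i]);

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    // Seed the output with the input so its boundary cells hold valid values.
    cudaMemcpy(d_out, h_in.data(), bytes, cudaMemcpyHostToDevice);

    jacobi1D<<<(nx + ntpb - 1) / ntpb, ntpb>>>(d_in, d_out, nx, ny);
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    float maxDiff = 0.0f;
    for (int k = 0; k < nx * ny; ++k)
        maxDiff = fmaxf(maxDiff, fabsf(h_out[k] - h_ref[k]));
    printf("max |gpu - cpu| = %g\n", maxDiff);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Assigning columns to threads (rather than rows) means that neighbouring threads read neighbouring addresses on each loop step, which tends to keep global memory accesses coalesced.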
 
The iters loop launches one kernel per iteration. We use double buffering, alternating the kernel arguments between (d_a, d_b) and (d_b, d_a) on successive launches, since we can't simply swap pointers like in the serial code.
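The alternating launch pattern can be sketched roughly as follows. Only `d_a`, `d_b` and `iters` come from the description above; the kernel `jacobiStep`, its stencil, the grid size and the boundary values are hypothetical stand-ins. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the Jacobi kernel: reads `in`, writes `out`.
__global__ void jacobiStep(const float* in, float* out, int nx, int ny) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < 1 || i >= nx - 1) return;
    for (int j = 1; j < ny - 1; ++j)
        out[j * nx + i] = 0.25f * (in[j * nx + i - 1] + in[j * nx + i + 1] +
                                   in[(j - 1) * nx + i] + in[(j + 1) * nx + i]);
}

int main() {
    const int nx = 64, ny = 64, iters = 100, ntpb = 32;   // assumed sizes
    const int nblks = (nx + ntpb - 1) / ntpb;
    const size_t bytes = size_t(nx) * ny * sizeof(float);

    std::vector<float> h(nx * ny, 0.0f);
    for (int i = 0; i < nx; ++i) h[i] = 1.0f;             // assumed boundary: hot top edge

    float *d_a = nullptr, *d_b = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    // Both buffers get the initial data so the fixed boundary exists in each.
    cudaMemcpy(d_a, h.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h.data(), bytes, cudaMemcpyHostToDevice);

    for (int it = 0; it < iters; ++it) {
        if (it % 2 == 0)
            jacobiStep<<<nblks, ntpb>>>(d_a, d_b, nx, ny);  // even step: read a, write b
        else
            jacobiStep<<<nblks, ntpb>>>(d_b, d_a, nx, ny);  // odd step: read b, write a
    }

    // The last launch wrote to d_a when iters is even and to d_b when it is odd.
    float* d_result = (iters % 2 == 0) ? d_a : d_b;
    cudaMemcpy(h.data(), d_result, bytes, cudaMemcpyDeviceToHost);
    printf("value just below the hot edge: %f\n", h[1 * nx + nx / 2]);

    cudaFree(d_a);
    cudaFree(d_b);
    return 0;
}
```

Launches on the same stream execute in order, so each iteration sees the finished output of the previous one without any explicit synchronization inside the loop.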
 
<source>
=== Assignment 3 ===
Optimization techniques used:
* Remove the for loop from the kernel and use 2D threading within blocks
* Use GPU constant memory for the Jacobi calculation constants
* Use the ghost cell pattern for shared memory within blocks
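The three techniques listed above could be combined along the following lines. This is a sketch under stated assumptions, not the project's kernel: the 4-point stencil, the single constant `c_weight`, the tile width `TILE`, and the name `jacobiTiled` are all hypothetical. Each block stages its tile plus a one-cell halo (the ghost cells) in shared memory before computing. Error checking is omitted for brevity.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define TILE 16                       // assumed block width and height

__constant__ float c_weight;          // hypothetical stencil constant in constant memory

__global__ void jacobiTiled(const float* in, float* out, int nx, int ny) {
    __shared__ float tile[TILE + 2][TILE + 2];          // tile plus one ghost cell per side
    int tx = threadIdx.x, ty = threadIdx.y;
    int i = blockIdx.x * TILE + tx;                     // global column
    int j = blockIdx.y * TILE + ty;                     // global row
    int si = tx + 1, sj = ty + 1;                       // position inside the padded tile

    if (i < nx && j < ny) {
        tile[sj][si] = in[j * nx + i];
        // Threads on the tile edge also fetch the neighbouring block's cell.
        if (tx == 0 && i > 0)             tile[sj][0]      = in[j * nx + i - 1];
        if (tx == TILE - 1 && i + 1 < nx) tile[sj][si + 1] = in[j * nx + i + 1];
        if (ty == 0 && j > 0)             tile[0][si]      = in[(j - 1) * nx + i];
        if (ty == TILE - 1 && j + 1 < ny) tile[sj + 1][si] = in[(j + 1) * nx + i];
    }
    __syncthreads();                                    // every thread reaches this barrier

    if (i > 0 && i < nx - 1 && j > 0 && j < ny - 1)     // update interior points only
        out[j * nx + i] = c_weight * (tile[sj][si - 1] + tile[sj][si + 1] +
                                      tile[sj - 1][si] + tile[sj + 1][si]);
}

int main() {
    const int nx = 70, ny = 50;                         // not multiples of TILE on purpose
    const size_t bytes = size_t(nx) * ny * sizeof(float);
    const float w = 0.25f;
    std::vector<float> h_in(nx * ny, 0.0f), h_out(nx * ny, 0.0f);
    for (int i = 0; i < nx; ++i) h_in[i] = 1.0f;        // assumed boundary: hot top edge

    std::vector<float> h_ref = h_in;                    // CPU reference for one sweep
    for (int j = 1; j < ny - 1; ++j)
        for (int i = 1; i < nx - 1; ++i)
            h_ref[j * nx + i] = w * (h_in[j * nx + i - 1] + h_in[j * nx + i + 1] +
                                     h_in[(j - 1) * nx + i] + h_in[(j + 1) * nx + i]);

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_out, h_in.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(c_weight, &w, sizeof(float));    // fill the constant before launch

    dim3 block(TILE, TILE);
    dim3 grid((nx + TILE - 1) / TILE, (ny + TILE - 1) / TILE);
    jacobiTiled<<<grid, block>>>(d_in, d_out, nx, ny);
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    float maxDiff = 0.0f;
    for (int k = 0; k < nx * ny; ++k)
        maxDiff = fmaxf(maxDiff, fabsf(h_out[k] - h_ref[k]));
    printf("max |gpu - cpu| = %g\n", maxDiff);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

A 4-point stencil never reads the diagonal neighbours, so the corner ghost cells are left unloaded here; a 9-point stencil would need them as well.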