Changes

GPUSquad

649 bytes added, 14:01, 7 April 2018

→‎Assignment 2

=== Assignment 2 ===

We parallelized the original code by moving the Jacobi calculations into a kernel. This initial parallel version uses only 1D threading: each thread covers one index along one dimension and runs a for loop over the other dimension.
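As a rough illustration of the 1D-threading approach, the sketch below assigns one thread per column and loops over rows inside the kernel. The original stencil is not shown on this page, so the 5-point average, the row-major m x n layout, and the names <code>jacobi1D</code> and <code>jacobiSweep1D</code> are assumptions made for the example only.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical 5-point Jacobi update (assumed; the real stencil may differ).
// One thread per column; each thread walks down the rows with a for loop.
__global__ void jacobi1D(const float* in, float* out, int m, int n) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (col < 1 || col >= n - 1) return;          // boundary columns stay fixed
    for (int row = 1; row < m - 1; ++row) {       // the "other dimension"
        int idx = row * n + col;
        out[idx] = 0.25f * (in[idx - 1] + in[idx + 1] + in[idx - n] + in[idx + n]);
    }
}

// Host helper (hypothetical): runs a single sweep and returns the new grid.
std::vector<float> jacobiSweep1D(const std::vector<float>& h_in, int m, int n) {
    assert(static_cast<int>(h_in.size()) == m * n);
    size_t bytes = h_in.size() * sizeof(float);
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    // Seed the output too, so the untouched boundary cells hold valid values.
    cudaMemcpy(d_out, h_in.data(), bytes, cudaMemcpyHostToDevice);

    const int ntpb = 256;                         // threads per block
    jacobi1D<<<(n + ntpb - 1) / ntpb, ntpb>>>(d_in, d_out, m, n);

    std::vector<float> h_out(h_in.size());
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

With this layout, neighbouring threads read neighbouring columns of the same row on each loop pass, so global memory accesses within a warp are contiguous.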

Each pass of the iters loop launches the kernel once. We double-buffer by alternating the argument order between launches, (d_a, d_b) on one iteration and (d_b, d_a) on the next, instead of swapping pointers as the serial code does.
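A minimal sketch of this double-buffered launch loop follows. The kernel body, the grid layout, and the helper name <code>jacobiIterate</code> are assumptions for illustration; only the alternation between <code>d_a</code> and <code>d_b</code> comes from the description above.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical 5-point Jacobi step, one thread per column (assumed stencil).
__global__ void jacobiStep(const float* in, float* out, int m, int n) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (col < 1 || col >= n - 1) return;
    for (int row = 1; row < m - 1; ++row) {
        int idx = row * n + col;
        out[idx] = 0.25f * (in[idx - 1] + in[idx + 1] + in[idx - n] + in[idx + n]);
    }
}

// Host helper (hypothetical): runs `iters` sweeps with two device buffers.
std::vector<float> jacobiIterate(const std::vector<float>& h_grid,
                                 int m, int n, int iters) {
    assert(static_cast<int>(h_grid.size()) == m * n);
    size_t bytes = h_grid.size() * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    // Both buffers start from the same grid so boundary cells match in each.
    cudaMemcpy(d_a, h_grid.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_grid.data(), bytes, cudaMemcpyHostToDevice);

    const int ntpb = 256;
    const int nblks = (n + ntpb - 1) / ntpb;
    for (int it = 0; it < iters; ++it) {
        if (it % 2 == 0)
            jacobiStep<<<nblks, ntpb>>>(d_a, d_b, m, n);   // read a, write b
        else
            jacobiStep<<<nblks, ntpb>>>(d_b, d_a, m, n);   // read b, write a
    }

    // The last launch wrote d_b when iters is odd, d_a when it is even.
    float* d_result = (iters % 2 == 1) ? d_b : d_a;
    std::vector<float> h_out(h_grid.size());
    cudaMemcpy(h_out.data(), d_result, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    cudaFree(d_b);
    return h_out;
}
```

Launches on the same stream execute in order, so each iteration sees the completed output of the previous one without an explicit synchronization call inside the loop.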

=== Assignment 3 ===

Optimization techniques used:

* Remove the for loop from the kernel and use 2D threading within blocks

* Use GPU constant memory for the Jacobi calculation constants

* Use the ghost cell pattern for shared memory within blocks
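The three techniques above can be sketched together as follows. This is an illustration under assumptions, not the project's actual kernel: the 5-point stencil, the single weight <code>c_w</code>, the tile size <code>TILE</code>, and the names <code>jacobiTiled</code> and <code>jacobiSweepTiled</code> are all hypothetical.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

#define TILE 16                       // assumed block edge length

__constant__ float c_w;               // hypothetical stencil weight in constant memory

// 2D threading: one thread per cell, no loop inside the kernel.
__global__ void jacobiTiled(const float* in, float* out, int m, int n) {
    // Tile interior plus a one-cell ring of ghost cells on each side.
    __shared__ float tile[TILE + 2][TILE + 2];
    int col = blockIdx.x * TILE + threadIdx.x;
    int row = blockIdx.y * TILE + threadIdx.y;
    int tx = threadIdx.x + 1, ty = threadIdx.y + 1;

    if (row < m && col < n) {
        int idx = row * n + col;
        tile[ty][tx] = in[idx];
        // Threads on the block edge also load the adjacent ghost cell.
        if (threadIdx.x == 0 && col > 0)            tile[ty][0]        = in[idx - 1];
        if (threadIdx.x == TILE - 1 && col < n - 1) tile[ty][TILE + 1] = in[idx + 1];
        if (threadIdx.y == 0 && row > 0)            tile[0][tx]        = in[idx - n];
        if (threadIdx.y == TILE - 1 && row < m - 1) tile[TILE + 1][tx] = in[idx + n];
    }
    __syncthreads();                  // every thread in the block reaches this

    if (row >= 1 && row < m - 1 && col >= 1 && col < n - 1)
        out[row * n + col] = c_w * (tile[ty][tx - 1] + tile[ty][tx + 1] +
                                    tile[ty - 1][tx] + tile[ty + 1][tx]);
}

// Host helper (hypothetical): one sweep with weight 0.25.
std::vector<float> jacobiSweepTiled(const std::vector<float>& h_in, int m, int n) {
    assert(static_cast<int>(h_in.size()) == m * n);
    size_t bytes = h_in.size() * sizeof(float);
    const float w = 0.25f;
    cudaMemcpyToSymbol(c_w, &w, sizeof(float));   // fill constant memory

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_out, h_in.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (m + TILE - 1) / TILE);
    jacobiTiled<<<grid, block>>>(d_in, d_out, m, n);

    std::vector<float> h_out(h_in.size());
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

The ghost cells let threads on a tile edge read their out-of-tile neighbours from shared memory rather than going back to global memory. Corner ghost cells are left unloaded because a 5-point stencil never reads them.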

Tsarkarcd

