Changes

GPUSquad

421 bytes removed, 13:01, 7 April 2018

→‎Assignment 3

=== Assignment 2 ===

We parallelized the original code by moving the Jacobi calculations into a kernel. For this initial parallel version, we used only 1D threading: each thread handles one index along one dimension and runs a for loop over the other dimension.

The iters loop launches one kernel per iteration. We use double buffering: each launch passes the buffers as either (d_a, d_b) or (d_b, d_a), alternating which one is read and which one is written, since we can't simply swap pointers as in the serial code.
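The structure described above can be sketched roughly as follows. This is a minimal illustration, not our actual assignment code: the four-neighbour averaging stencil, the fixed boundary cells, the float grid, and the names jacobiStep, runJacobi, nx and ny are all assumptions made for the example. Only d_a, d_b and the iters loop come from the description above.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Assumed stencil (hypothetical): each interior cell becomes the average of
// its four neighbours; boundary cells are copied through unchanged.
__global__ void jacobiStep(const float* in, float* out, int nx, int ny) {
    // 1D threading: one thread per column i.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nx) return;
    // Each thread loops over the other dimension (rows j).
    for (int j = 0; j < ny; ++j) {
        int idx = j * nx + i;
        bool boundary = (i == 0 || i == nx - 1 || j == 0 || j == ny - 1);
        out[idx] = boundary ? in[idx]
                            : 0.25f * (in[idx - 1] + in[idx + 1] +
                                       in[idx - nx] + in[idx + nx]);
    }
}

// Hypothetical host helper: runs `iters` Jacobi sweeps and returns the grid.
std::vector<float> runJacobi(std::vector<float> grid, int nx, int ny, int iters) {
    size_t bytes = grid.size() * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMemcpy(d_a, grid.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (nx + threads - 1) / threads;
    for (int it = 0; it < iters; ++it) {
        // Double buffering: alternate which buffer is read and which is written.
        if (it % 2 == 0) jacobiStep<<<blocks, threads>>>(d_a, d_b, nx, ny);
        else             jacobiStep<<<blocks, threads>>>(d_b, d_a, nx, ny);
    }
    // After an even number of sweeps the latest data is in d_a, otherwise d_b.
    float* d_result = (iters % 2 == 0) ? d_a : d_b;
    cudaMemcpy(grid.data(), d_result, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    cudaFree(d_b);
    return grid;
}
```

Because every sweep reads only from one buffer and writes only to the other, no thread sees a value that was already updated in the same iteration, which is what the Jacobi method requires. Error checking on the CUDA calls is omitted here to keep the sketch short.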

Tsarkarcd


CDOT Wiki β
