coalescing
After switching to shared memory and prefetching the values the kernel operates on, my GPU no longer crashes on extreme runs involving millions of steps. It also outperforms my CPU running the MPI version of this application on 4 threads at 4.9 GHz each.
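My video card has 48 KB of shared memory, and all of my arrays together use less than 20 KB, so everything fits. Once the data is in shared memory I do not need to worry about coalescing, since coalescing only concerns global memory accesses and shared memory is much faster. The initial loads from global memory into shared memory still benefit from being coalesced.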
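As a rough illustration of the pattern described above, here is a minimal sketch of a kernel that stages data in shared memory and then works only on the staged copy. It is not the kernel from my application: the name `smooth3`, the 3-point average, the block size of 256, the zero padding at the array ends, and the host helper `stencil_matches_reference` are all assumptions made up for the example. Each block uses about 1 KB of shared memory, well under the 48 KB limit.

```cuda
#include <cassert>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

constexpr int BLOCK = 256;  // threads per block (assumed for this sketch)

// Hypothetical kernel: 3-point average computed from a shared-memory tile.
// The tile holds BLOCK values plus one halo value on each side.
__global__ void smooth3(const float* in, float* out, int n)
{
    __shared__ float tile[BLOCK + 2];
    assert(blockDim.x == BLOCK);  // the tile size assumes this launch shape

    int g = blockIdx.x * blockDim.x + threadIdx.x;  // global index
    int s = threadIdx.x + 1;                        // index inside the tile

    // Coalesced load: consecutive threads read consecutive global addresses.
    tile[s] = (g < n) ? in[g] : 0.0f;

    // Thread 0 also fetches the two halo values (zero outside the array).
    if (threadIdx.x == 0) {
        int left = g - 1;
        int right = g + BLOCK;
        tile[0] = (left >= 0) ? in[left] : 0.0f;
        tile[BLOCK + 1] = (right < n) ? in[right] : 0.0f;
    }
    __syncthreads();  // the whole tile must be filled before anyone reads it

    // From here on, every read hits shared memory, not global memory.
    if (g < n)
        out[g] = (tile[s - 1] + tile[s] + tile[s + 1]) / 3.0f;
}

// Hypothetical host helper: runs the kernel and compares against a CPU loop.
bool stencil_matches_reference()
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> h_in(n), h_out(n);
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i % 97);

    float *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return false; }

    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    smooth3<<<(n + BLOCK - 1) / BLOCK, BLOCK>>>(d_in, d_out, n);
    // This blocking copy also reports any error raised by the kernel launch.
    cudaError_t err = cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) {
        float l = (i > 0) ? h_in[i - 1] : 0.0f;
        float r = (i + 1 < n) ? h_in[i + 1] : 0.0f;
        float ref = (l + h_in[i] + r) / 3.0f;
        if (std::fabs(h_out[i] - ref) > 1e-4f) return false;
    }
    return true;
}
```

To try it, call `stencil_matches_reference()` from a `main` and build with `nvcc`. The only global memory traffic is the one load into `tile` and the one store to `out`, and both are indexed by `g`, so neighbouring threads touch neighbouring addresses. That is the part where coalescing still matters; the three reads per output value come from shared memory.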