====== Optimization ======
After moving the kernel's working data into shared memory and prefetching some constant values, my GPU no longer crashes on extreme runs involving millions of steps. It also outperforms my CPU running the MPI version of this application on 4 threads at 4.9 GHz each.
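
The general pattern can be illustrated with a minimal CUDA sketch. This is not the application's actual kernel: the smoothing update, the names ''Params'', ''d_params'', ''relax'', and ''BLOCK'', and the problem size are all assumptions made for illustration. Each block copies its slice of the data into shared memory once, reads the run constants from constant memory into registers before the loop, performs every step on the shared tile, and writes back to global memory only at the end.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int BLOCK = 256;  // threads per block (assumed value)

// Hypothetical run constants; the real application's parameters are unknown.
struct Params { float alpha; int steps; };
__constant__ Params d_params;

__global__ void relax(float* data, int n)
{
    __shared__ float tile[BLOCK];

    // Prefetch the constants into registers once, before the step loop.
    const float alpha = d_params.alpha;
    const int   steps = d_params.steps;

    const int tid   = threadIdx.x;
    const int base  = blockIdx.x * blockDim.x;
    const int gid   = base + tid;
    const int valid = min((int)blockDim.x, n - base);  // elements owned by this block

    // One global read per element; all later steps work on the shared tile.
    if (tid < valid) tile[tid] = data[gid];
    __syncthreads();

    for (int s = 0; s < steps; ++s) {
        float next = 0.0f;
        if (tid < valid) {
            // Neighbours are clamped at the block edges, so blocks stay independent.
            const float left  = tile[tid > 0 ? tid - 1 : tid];
            const float right = tile[tid + 1 < valid ? tid + 1 : tid];
            next = tile[tid] + alpha * (left - 2.0f * tile[tid] + right);
        }
        __syncthreads();                 // every thread has finished reading
        if (tid < valid) tile[tid] = next;
        __syncthreads();                 // every write is visible before the next step
    }

    // One global write per element.
    if (tid < valid) data[gid] = tile[tid];
}

int main()
{
    const int n = 1 << 20;
    std::vector<float> host(n);
    for (int i = 0; i < n; ++i) host[i] = static_cast<float>(i % 7);

    const Params p = {0.25f, 1000};      // illustrative values only
    cudaMemcpyToSymbol(d_params, &p, sizeof(Params));

    float* dev = nullptr;
    cudaMalloc(&dev, n * sizeof(float));
    cudaMemcpy(dev, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    const int blocks = (n + BLOCK - 1) / BLOCK;
    relax<<<blocks, BLOCK>>>(dev, n);

    const cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(dev);
        return 1;
    }

    cudaMemcpy(host.data(), dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    std::printf("host[0] after %d steps: %f\n", p.steps, host[0]);

    cudaFree(dev);
    return 0;
}
```

The kernel must be launched with ''BLOCK'' threads per block, because the shared tile is sized at compile time. Both ''%%__syncthreads()%%'' calls sit outside the ''tid < valid'' branches so that every thread in the block reaches them, including the idle threads of a partially filled last block.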