=== Assignment 2 ===
==== Colin's Report ====
For assignment 2 we chose the Heat Equation problem. I profiled both the serial and CUDA versions of the code, running 25 steps each time and recording the average time per step in milliseconds.
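As a rough illustration of how a per-step average like this could be measured on the GPU, here is a minimal sketch using CUDA events. The kernel <code>heatStep</code>, its four-neighbour averaging stencil, and the helper <code>averageStepMs</code> are hypothetical stand-ins, not the code that was actually profiled. Kernel launches are asynchronous, so the sketch waits on the stop event before reading the elapsed time.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for one heat-equation step (assumed four-neighbour average).
// Boundary cells are not written, so both buffers should already hold the boundary values.
__global__ void heatStep(const float* in, float* out, int n) {
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (row > 0 && row < n - 1 && col > 0 && col < n - 1) {
        int i = row * n + col;
        out[i] = 0.25f * (in[i - 1] + in[i + 1] + in[i - n] + in[i + n]);
    }
}

// Runs `steps` kernel launches on an n x n grid and returns the average time per step in ms.
float averageStepMs(float* d_a, float* d_b, int n, int steps, dim3 block) {
    // Ceiling division so the grid covers every cell.
    dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    for (int s = 0; s < steps; ++s) {
        heatStep<<<grid, block>>>(d_a, d_b, n);
        float* tmp = d_a; d_a = d_b; d_b = tmp;  // output of this step feeds the next
    }
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until all queued launches have finished

    float totalMs = 0.0f;
    cudaEventElapsedTime(&totalMs, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return totalMs / steps;
}
```

Calling <code>averageStepMs(d_a, d_b, n, 25, dim3(16, 16))</code> would give a figure comparable to the one described here, provided the serial version is timed over the same 25 steps.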
The tests were run on a laptop with a GeForce 650 GPU. Due to memory constraints, the largest matrix that could be run was 15000x15000.
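To give a sense of where such a limit might come from, the sketch below estimates the memory footprint of the grid. It assumes single-precision values and two buffers (current and next step), neither of which is stated in this report. Under those assumptions a 15000x15000 grid needs 15000 × 15000 × 4 × 2 = 1,800,000,000 bytes, roughly 1.68 GiB. The helper names <code>gridBytes</code> and <code>fitsOnDevice</code> are hypothetical.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Bytes needed for `buffers` square n x n grids of `elemSize`-byte values.
size_t gridBytes(size_t n, size_t buffers, size_t elemSize) {
    return n * n * buffers * elemSize;
}

// True if the grids fit in the device memory currently reported as free.
// Returns false if the query itself fails (for example, when no device is present).
bool fitsOnDevice(size_t n, size_t buffers, size_t elemSize) {
    size_t freeBytes = 0, totalBytes = 0;
    if (cudaMemGetInfo(&freeBytes, &totalBytes) != cudaSuccess) {
        return false;
    }
    return gridBytes(n, buffers, elemSize) <= freeBytes;
}
```

A check such as <code>fitsOnDevice(n, 2, sizeof(float))</code> before allocating would report whether a given size fits, if the limit is device memory rather than host memory.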
I've created a chart comparing the runtimes:
[[Image:GPUA2Colin.png|400px]]
====== Conclusions ======
There were no major issues converting the code to CUDA: the computation works on a simple matrix, which made the port straightforward. When I first converted the code, however, I noticed that it ran slower than the CPU version. This was caused by inefficient block sizing. I fixed it by adjusting the number of threads per block until the kernel made more efficient use of the CUDA cores. In the end, without any other optimizations, it runs at around twice the speed of the CPU code.
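The block-size tuning described above can also be done as a small sweep. The following is a minimal sketch under assumptions: the kernel <code>heatStep1D</code>, its stencil, the list of candidate sizes, and the helpers <code>blocksFor</code> and <code>fastestBlockSize</code> are all hypothetical and are not taken from the assignment code. The candidates are multiples of 32 because threads are scheduled in warps of 32.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical 1D-indexed step over an n x n row-major grid (assumed stencil).
// n * n must fit in an int, which holds for n up to 46340.
__global__ void heatStep1D(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n * n) return;
    int row = i / n, col = i % n;
    if (row == 0 || row == n - 1 || col == 0 || col == n - 1) {
        out[i] = in[i];  // keep boundary values fixed
    } else {
        out[i] = 0.25f * (in[i - 1] + in[i + 1] + in[i - n] + in[i + n]);
    }
}

// Number of blocks needed to cover `total` elements (ceiling division).
int blocksFor(int total, int threadsPerBlock) {
    return (total + threadsPerBlock - 1) / threadsPerBlock;
}

// Times one launch per candidate block size and returns the fastest candidate.
int fastestBlockSize(const float* d_in, float* d_out, int n) {
    const int candidates[] = {32, 64, 128, 256, 512, 1024};
    const int count = sizeof(candidates) / sizeof(candidates[0]);
    int best = candidates[0];
    float bestMs = -1.0f;

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    for (int c = 0; c < count; ++c) {
        int threads = candidates[c];
        int blocks = blocksFor(n * n, threads);
        heatStep1D<<<blocks, threads>>>(d_in, d_out, n);  // warm-up launch, not timed
        cudaEventRecord(start);
        heatStep1D<<<blocks, threads>>>(d_in, d_out, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("%4d threads/block: %.3f ms\n", threads, ms);
        if (bestMs < 0.0f || ms < bestMs) { bestMs = ms; best = threads; }
    }
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return best;
}
```

A single timed launch per candidate is noisy, so averaging several launches per block size would give a more reliable comparison.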
=== Assignment 3 ===