Hu3Team

477 bytes added, 18:06, 5 December 2014
CUDA Coding
}
</pre>
We used shared memory to speed up the kernel's memory access, together with coalesced global memory access. We were already using a simple reduction to find the biggest difference, but these two optimizations alone gave us a speedup of almost 50% over the first, unoptimized CUDA solution.
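As an illustration only, and not the project's actual kernel, a shared-memory reduction that finds the biggest difference between two versions of the grid could look like the sketch below. The names <code>maxDiff</code>, <code>largestDifference</code> and <code>THREADS</code> are hypothetical, and the sketch assumes the grid is stored as a flat array of <code>float</code> and that the final pass over the per-block maxima is done on the host.

```cuda
#include <algorithm>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

#define THREADS 256  // hypothetical block size; must be a power of two

// Each block writes the largest |a[i] - b[i]| it sees to blockMax[blockIdx.x].
__global__ void maxDiff(const float* a, const float* b, float* blockMax, int count) {
    __shared__ float cache[THREADS];
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Consecutive threads read consecutive addresses, so the loads coalesce.
    // Out-of-range threads contribute 0, the identity for a max of absolute values.
    cache[threadIdx.x] = (i < count) ? fabsf(a[i] - b[i]) : 0.0f;
    __syncthreads();

    // Tree reduction in shared memory: halve the active threads each round.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride) {
            cache[threadIdx.x] = fmaxf(cache[threadIdx.x], cache[threadIdx.x + stride]);
        }
        __syncthreads();
    }
    if (threadIdx.x == 0) blockMax[blockIdx.x] = cache[0];
}

// Host helper (assumes a and b are non-empty and have the same size).
float largestDifference(const std::vector<float>& a, const std::vector<float>& b) {
    int count = static_cast<int>(a.size());
    int blocks = (count + THREADS - 1) / THREADS;
    size_t bytes = count * sizeof(float);

    float *dA, *dB, *dMax;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dMax, blocks * sizeof(float));
    cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice);

    maxDiff<<<blocks, THREADS>>>(dA, dB, dMax, count);

    // cudaMemcpy waits for the kernel on the default stream before copying.
    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), dMax, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dMax);

    // Finish the reduction over the per-block maxima on the host.
    return *std::max_element(partial.begin(), partial.end());
}
```

Calling <code>largestDifference(oldGrid, newGrid)</code> after each iteration would give the value to compare against a convergence threshold. Error checking of the CUDA calls is left out to keep the sketch short.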
Because the heat calculation reads the neighboring elements of each cell in the matrix, we had to add more extensive bounds checks to that part of the kernel to avoid illegal memory accesses.
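The sketch below shows one way such a guarded neighbor access can be combined with a shared-memory tile. It is not the kernel from this page: the names <code>heatStep</code>, <code>runHeatStep</code> and <code>TILE</code> are hypothetical, and it assumes a square <code>n</code> by <code>n</code> grid of <code>float</code>, a four-neighbor average as the update rule, and fixed boundary cells.

```cuda
#include <vector>
#include <cuda_runtime.h>

#define TILE 16  // hypothetical tile width; each block is TILE x TILE threads

// One heat step on an n x n grid: interior cells become the average of
// their four neighbors, boundary cells are copied unchanged.
__global__ void heatStep(const float* in, float* out, int n) {
    // The tile carries a one-cell halo on every side for the neighbor reads.
    __shared__ float tile[TILE + 2][TILE + 2];

    int col = blockIdx.x * TILE + threadIdx.x;
    int row = blockIdx.y * TILE + threadIdx.y;
    int tx = threadIdx.x + 1;
    int ty = threadIdx.y + 1;
    bool inside = (row < n && col < n);

    if (inside) {
        // Consecutive threadIdx.x values read consecutive addresses (coalesced).
        tile[ty][tx] = in[row * n + col];
        // Threads on the tile edge load the halo, but only where the
        // neighbor actually exists in the grid.
        if (threadIdx.x == 0 && col > 0)            tile[ty][0]      = in[row * n + col - 1];
        if (threadIdx.x == TILE - 1 && col < n - 1) tile[ty][tx + 1] = in[row * n + col + 1];
        if (threadIdx.y == 0 && row > 0)            tile[0][tx]      = in[(row - 1) * n + col];
        if (threadIdx.y == TILE - 1 && row < n - 1) tile[ty + 1][tx] = in[(row + 1) * n + col];
    }
    __syncthreads();  // reached by every thread, including out-of-range ones

    if (!inside) return;
    if (row == 0 || col == 0 || row == n - 1 || col == n - 1) {
        out[row * n + col] = tile[ty][tx];  // boundary cell: no neighbor reads
        return;
    }
    out[row * n + col] = 0.25f * (tile[ty - 1][tx] + tile[ty + 1][tx] +
                                  tile[ty][tx - 1] + tile[ty][tx + 1]);
}

// Host helper: runs a single step and returns the new grid.
std::vector<float> runHeatStep(const std::vector<float>& grid, int n) {
    size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    float *dIn, *dOut;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, grid.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 blocks((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    heatStep<<<blocks, block>>>(dIn, dOut, n);

    std::vector<float> result(static_cast<size_t>(n) * n);
    cudaMemcpy(result.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return result;
}
```

The checks on <code>row</code> and <code>col</code> matter most when <code>n</code> is not a multiple of <code>TILE</code>, because the last blocks then contain threads that fall outside the grid. Those threads still have to reach <code>__syncthreads()</code>, which is why the early return comes after the barrier.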
====Comparing the results====