BetaT

170 bytes added, 00:53, 12 April 2017
Solution to Windows Display Driver Crashing
This reduced the number of threads launched by the kernel. In the Calculate kernel below, you can see that the old version left all the threads from the (y dimension) sitting idle, doing nothing except slowing down execution.
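To illustrate why idle y-dimension threads matter, here is a minimal sketch that counts how many threads are actually launched under a 2D configuration versus a 1D one. The kernel name CountThreads, the problem size n = 2000, and both block shapes are assumptions made for this illustration; they are not the launch configuration used in the project.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Every launched thread bumps a counter, so the total shows how many
// threads the GPU had to schedule for a given launch configuration.
__global__ void CountThreads(unsigned int* launched)
{
    atomicAdd(launched, 1u);
}

int main()
{
    const int n = 2000;  // hypothetical problem size
    unsigned int* d_count = nullptr;
    unsigned int h2d = 0, h1d = 0;
    cudaMalloc(&d_count, sizeof(unsigned int));

    // 2D launch: if the kernel only uses the x index, every thread with
    // threadIdx.y/blockIdx.y != 0 is scheduled but does no useful work.
    dim3 block2(32, 32);
    dim3 grid2((n + 31) / 32, (n + 31) / 32);
    cudaMemset(d_count, 0, sizeof(unsigned int));
    CountThreads<<<grid2, block2>>>(d_count);
    cudaMemcpy(&h2d, d_count, sizeof(unsigned int), cudaMemcpyDeviceToHost);

    // 1D launch: only as many threads as there are elements (rounded up
    // to a whole number of blocks).
    const int ntpb = 1024;
    cudaMemset(d_count, 0, sizeof(unsigned int));
    CountThreads<<<(n + ntpb - 1) / ntpb, ntpb>>>(d_count);
    cudaMemcpy(&h1d, d_count, sizeof(unsigned int), cudaMemcpyDeviceToHost);

    printf("2D launch: %u threads, 1D launch: %u threads\n", h2d, h1d);
    cudaFree(d_count);
    return h1d < h2d ? 0 : 1;
}
```

The difference between the two counts is the number of threads the scheduler has to handle without any benefit to the computation.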
==== PARALLELIZED CALCULATE WAVE KERNEL ====
__global__ void Calculate (float* u, float* un,int nx, int c, float dx, float dt)
{
}
==== OPTIMIZED CALCULATE WAVE KERNEL ====
The code below has been altered to remove the (j) variable and to combine the two (if) statements into one, so that we can reduce (Thread Divergence). The recurring (- c*dt/dx* ) computation has also been moved into a variable called total, so that each thread does NOT repeat the same operation, which causes a decrease in performance.
With this optimized code it is now possible to execute with a problem size > 2000 & 2000.
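As a rough sketch of these three changes (a single index, one combined (if), and the hoisted total variable), consider the kernel below. It keeps the Calculate signature shown earlier, but the stencil is an assumption: a single first-order upwind step with one thread per grid point. The host code, sizes, and initial condition are likewise hypothetical and only there to check the result.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumed stencil (not taken from the page): one upwind time step,
// u[i] = un[i] - c*dt/dx * (un[i] - un[i-1]), one thread per grid point.
__global__ void Calculate(float* u, float* un, int nx, int c, float dx, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // single index, no j
    float total = c * dt / dx;                      // computed once per thread

    if (i > 0 && i < nx) {                          // one combined bounds check
        u[i] = un[i] - total * (un[i] - un[i - 1]);
    }
}

int main()
{
    const int nx = 1024, c = 1;                     // hypothetical sizes
    const float dx = 2.0f / (nx - 1), dt = 0.0005f;

    // Hypothetical initial condition: a square pulse between x = 0.5 and x = 1.
    std::vector<float> h_un(nx), h_u(nx);
    for (int i = 0; i < nx; i++)
        h_un[i] = (i * dx >= 0.5f && i * dx <= 1.0f) ? 2.0f : 1.0f;

    float *d_u = nullptr, *d_un = nullptr;
    cudaMalloc(&d_u, nx * sizeof(float));
    cudaMalloc(&d_un, nx * sizeof(float));
    cudaMemcpy(d_un, h_un.data(), nx * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_u, h_un.data(), nx * sizeof(float), cudaMemcpyHostToDevice);

    const int ntpb = 256;
    Calculate<<<(nx + ntpb - 1) / ntpb, ntpb>>>(d_u, d_un, nx, c, dx, dt);
    cudaMemcpy(h_u.data(), d_u, nx * sizeof(float), cudaMemcpyDeviceToHost);

    // Compare against the same update computed on the CPU.
    const float total = c * dt / dx;
    int bad = 0;
    for (int i = 1; i < nx; i++) {
        float ref = h_un[i] - total * (h_un[i] - h_un[i - 1]);
        if (std::fabs(h_u[i] - ref) > 1e-5f) bad++;
    }
    printf("mismatches: %d\n", bad);

    cudaFree(d_u);
    cudaFree(d_un);
    return bad == 0 ? 0 : 1;
}
```

Because un is only read and u is only written within the step, the threads do not depend on each other's results in this sketch.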
==== ORIGINAL INITIALIZATION KERNEL ====
The Initialize kernel has also been redesigned. Below is the original:
}
==== OPTIMIZED INITIALIZATION KERNEL ====
I removed the variable (j) and the syncthreads() calls, which were not needed. I also removed the function running on the CPU that initialized all indexes in the arrays to 0, and moved that work onto the GPU, below.
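A minimal sketch of zeroing both arrays on the GPU with a single index and no synchronization might look like the following. The Initialize signature, the element count n, and the host-side check are assumptions for illustration and may differ from the kernel on this page.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical signature: n is the total number of elements in each array.
__global__ void Initialize(float* u, float* un, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // single index, no j
    if (i < n) {
        // Each thread writes only its own element, so no __syncthreads()
        // is needed between the two stores.
        u[i] = 0.0f;
        un[i] = 0.0f;
    }
}

int main()
{
    const int nx = 2000;          // hypothetical problem size
    const int n = nx * nx;
    float *d_u = nullptr, *d_un = nullptr;
    cudaMalloc(&d_u, n * sizeof(float));
    cudaMalloc(&d_un, n * sizeof(float));

    // Fill with non-zero bytes first so the check below is meaningful.
    cudaMemset(d_u, 0xFF, n * sizeof(float));
    cudaMemset(d_un, 0xFF, n * sizeof(float));

    const int ntpb = 1024;
    Initialize<<<(n + ntpb - 1) / ntpb, ntpb>>>(d_u, d_un, n);

    std::vector<float> h_u(n), h_un(n);
    cudaMemcpy(h_u.data(), d_u, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaMemcpy(h_un.data(), d_un, n * sizeof(float), cudaMemcpyDeviceToHost);

    int bad = 0;
    for (int i = 0; i < n; i++)
        if (h_u[i] != 0.0f || h_un[i] != 0.0f) bad++;
    printf("non-zero elements: %d\n", bad);

    cudaFree(d_u);
    cudaFree(d_un);
    return bad == 0 ? 0 : 1;
}
```

Doing the zeroing on the device avoids a CPU loop over every element followed by a host-to-device copy of arrays that contain only zeros.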