Changes


Kernal Blas

463 bytes added, 22:18, 2 April 2018
=== Assignment 2 ===
In order to parallelize the code from above, we decided to use a kernel to handle the calculations.
The logic largely remains the same, but we offload the CPU calculations to the GPU. <br>This code generates the random points within the kernel, and the calculations are also done there.<br>
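A minimal sketch of what such a kernel could look like is given here. It assumes cuRAND's device API for the random numbers and a Monte Carlo estimate over the unit square. The names <code>countHits</code> and <code>estimatePi</code>, the launch parameters, and the seed are hypothetical and not taken from the original code. Error checking is left out for brevity.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Each thread draws `samplesPerThread` points in the unit square and
// records how many of them fall inside the quarter circle of radius 1.
__global__ void countHits(unsigned int* hits, int samplesPerThread,
                          unsigned long long seed) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    curandState state;
    curand_init(seed, tid, 0, &state);  // separate subsequence per thread
    unsigned int inside = 0;
    for (int i = 0; i < samplesPerThread; ++i) {
        float x = curand_uniform(&state);
        float y = curand_uniform(&state);
        if (x * x + y * y <= 1.0f) ++inside;
    }
    hits[tid] = inside;  // one counter per thread
}

// Hypothetical host wrapper: launches the kernel, copies the per-thread
// counters back to the host, and sums them on the CPU.
double estimatePi(int blocks, int threadsPerBlock, int samplesPerThread,
                  unsigned long long seed) {
    assert(blocks > 0 && threadsPerBlock > 0 && samplesPerThread > 0);
    int nThreads = blocks * threadsPerBlock;
    unsigned int* dHits = nullptr;
    cudaMalloc(&dHits, nThreads * sizeof(unsigned int));
    countHits<<<blocks, threadsPerBlock>>>(dHits, samplesPerThread, seed);
    std::vector<unsigned int> hHits(nThreads);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(hHits.data(), dHits, nThreads * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(dHits);  // release the device buffer
    unsigned long long total = 0;
    for (unsigned int h : hHits) total += h;
    return 4.0 * total / (static_cast<double>(nThreads) * samplesPerThread);
}
```

Calling <code>estimatePi(64, 128, 1000, 1234ULL)</code> would draw about 8 million points in total. Note that this version copies one counter per thread back to the host.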
<br>
Offloading to the GPU reduces the pi calculation time. <br>
The CPU's results change drastically as we increase the iteration count tenfold. <br>
However, the parallelized results seem to stay accurate throughout the iterations. <br>
It seems as though the calculation time doesn't change much and stays consistent. <br>Profiling the code shows that '''memcpy''' takes up most of the time spent. Even when <br>there are only 10 iterations, the time remains at 300 milliseconds. <br>Once the iteration count passes 25 million, we run into a small memory leak, which produces inaccurate results. <br><br> In order to optimize the code, we must find a way to reduce the time memcpy takes. <br>
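One possible way to shrink the transfer is to accumulate the hit count on the device and copy back a single value. This is only a sketch, and it assumes the memcpy cost comes from moving per-thread or per-point results to the host. A cost that stays fixed regardless of the iteration count may instead reflect one-time startup overhead, which this change would not remove. The names <code>countHitsAtomic</code> and <code>estimatePiSingleCopy</code> are hypothetical.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Each thread counts its own hits locally, then adds them to one global
// counter with a single atomicAdd, so only 8 bytes need to leave the GPU.
__global__ void countHitsAtomic(unsigned long long* total, int samplesPerThread,
                                unsigned long long seed) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    curandState state;
    curand_init(seed, tid, 0, &state);  // separate subsequence per thread
    unsigned long long inside = 0;
    for (int i = 0; i < samplesPerThread; ++i) {
        float x = curand_uniform(&state);
        float y = curand_uniform(&state);
        if (x * x + y * y <= 1.0f) ++inside;
    }
    atomicAdd(total, inside);  // one atomic operation per thread
}

// Hypothetical host wrapper: the only device-to-host copy is one counter.
double estimatePiSingleCopy(int blocks, int threadsPerBlock,
                            int samplesPerThread, unsigned long long seed) {
    assert(blocks > 0 && threadsPerBlock > 0 && samplesPerThread > 0);
    unsigned long long* dTotal = nullptr;
    cudaMalloc(&dTotal, sizeof(unsigned long long));
    cudaMemset(dTotal, 0, sizeof(unsigned long long));  // start from zero
    countHitsAtomic<<<blocks, threadsPerBlock>>>(dTotal, samplesPerThread, seed);
    unsigned long long hTotal = 0;
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(&hTotal, dTotal, sizeof(hTotal), cudaMemcpyDeviceToHost);
    cudaFree(dTotal);  // release the device counter
    double samples = static_cast<double>(blocks) * threadsPerBlock * samplesPerThread;
    return 4.0 * hTotal / samples;
}
```

Profiling this variant against the existing code would show whether the transfer size is really what dominates. If the time does not move, the cost likely lies elsewhere.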
=== Assignment 3 ===