'''Calculations of Pi'''
For some reason the code crashes my graphics driver above 8,000,000 (8 million) dots, and even at 8 million it crashes most of the time. The Nvidia Visual Profiler doesn't work either: it gets stuck generating the timeline. I therefore used clock_t in the code to measure the kernel's execution time, though I don't think this is 100% accurate.
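One reason a clock_t measurement can be off is that kernel launches are asynchronous with respect to the host, so a host timer may stop before the kernel has finished unless the host synchronizes first. A possible alternative is timing with CUDA events. The following is only a minimal sketch: dummyKernel and timeKernelMs are hypothetical names, and the kernel is a placeholder, not the Monte Carlo kernel from this project.

```cuda
#include <cuda_runtime.h>

// Hypothetical placeholder kernel; it only stands in for the real work.
__global__ void dummyKernel(float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * i;
}

// Hypothetical helper: returns the GPU time of one launch in milliseconds,
// or -1.0f if allocation or the launch failed.
float timeKernelMs(int n) {
    float *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) return -1.0f;

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int threads = 256;                        // assumed block size
    int blocks = (n + threads - 1) / threads; // round up to cover all n

    cudaEventRecord(start);                   // marker before the kernel
    dummyKernel<<<blocks, threads>>>(d_out, n);
    cudaEventRecord(stop);                    // marker after the kernel
    cudaEventSynchronize(stop);               // wait until the kernel is done

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);   // time between the two markers

    cudaError_t err = cudaGetLastError();     // catches launch errors
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_out);
    return (err == cudaSuccess) ? ms : -1.0f;
}
```

If clock_t is kept instead, calling cudaDeviceSynchronize() before reading the end time should at least make sure the kernel has completed.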
'''Approach'''
Instead of doing everything in main, I created a separate function for the calculation. All random number generation is done inside the kernel using the cuRAND library. The kernel is also responsible for all the calculations, and it uses shared memory so that the threads within a block can produce a partial sum. Here are some snippets of the code.
'''Some Code Snippets'''
If the dot is within the circle, the kernel sets the element at index tid (threadIdx.x) of the temp array in shared memory to 1 and then synchronizes the threads. It then sums all the 1s in the temp array for that block and writes the result to another array.
[[File:Code1.JPG]]
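For readers who cannot view the image, the step described above could look roughly like the sketch below. It is not necessarily the code in the screenshot: the names countHits and blockSums, the block size of 256, the per-thread curand_init seeding, and the serial sum by thread 0 are all assumptions made for illustration.

```cuda
#include <cuda_runtime.h>
#include <curand_kernel.h>

#define THREADS_PER_BLOCK 256  // assumed block size; launch with this many threads

// Hypothetical kernel: each thread tests one random dot and each block
// writes its number of hits to blockSums[blockIdx.x].
__global__ void countHits(unsigned int *blockSums, int n,
                          unsigned long long seed) {
    __shared__ unsigned int temp[THREADS_PER_BLOCK];
    int tid = threadIdx.x;
    int gid = blockIdx.x * blockDim.x + tid;

    unsigned int hit = 0;
    if (gid < n) {
        curandState state;
        curand_init(seed, gid, 0, &state);  // one sequence per thread
        float x = curand_uniform(&state);   // uniform in (0, 1]
        float y = curand_uniform(&state);
        hit = (x * x + y * y <= 1.0f) ? 1u : 0u;  // inside the quarter circle?
    }
    temp[tid] = hit;
    __syncthreads();  // all writes to temp must finish before summing

    // Assumed for simplicity: thread 0 adds up the block's entries serially.
    if (tid == 0) {
        unsigned int sum = 0;
        for (int i = 0; i < blockDim.x; ++i) sum += temp[i];
        blockSums[blockIdx.x] = sum;
    }
}
```

A tree-style reduction in shared memory would likely be faster than the serial loop, but the loop keeps the sketch close to the description.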
After copying the partial sums from the device to the host, obtain the total by looping over all the indexes and adding the values together. This total sum is then used to calculate the value of pi.
[[File:Code2.JPG]]
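The host-side step can be sketched as follows. This assumes the dots are drawn uniformly from the unit square and tested against a quarter circle of radius 1, so that the fraction of hits approaches pi/4 and pi is approximately 4 * hits / total dots. The function name estimatePi and its parameters are hypothetical and may differ from the code in the screenshot.

```cuda
// Hypothetical helper: blockSums holds one partial sum per block, already
// copied back to the host; totalDots is the number of dots generated.
double estimatePi(const unsigned int *blockSums, int numBlocks,
                  unsigned long long totalDots) {
    unsigned long long inside = 0;
    for (int i = 0; i < numBlocks; ++i) {
        inside += blockSums[i];  // accumulate the hits from every block
    }
    // Quarter-circle area / unit-square area = pi / 4.
    return 4.0 * (double)inside / (double)totalDots;
}
```

A 64-bit accumulator is used here so the total cannot overflow even for very large dot counts.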
'''Results for 1 Million Dots'''
[[File:ChartMonteCarlo.JPG]]