1
edit
Changes
→Conclusion
==== '''Conclusion''' ====
Using CUDA technology and parallelizing the serial code in the original code, there is an enormous increase in performance (lower execution time) to calculate , as high as '''1372%'''. In the next (final) phase, an attempt to investigate if shared memory, optimal memory allocation, minimizing said memory access time, and other optimization factors would provide a further increase (lower execution time) in performance for ''pi_cuda''.
=== '''Assignment 3''' ===