Difference between revisions of "Unique Project Page"

From CDOT Wiki
== Introduction : GPU Benchmarking/Testing using Mandelbrot Sets : Kartik Nagarajan ==

This program generates Mandelbrot sets on the CPU and saves them to disk as PNG files using the FreeImage library.

The program is open source and can be fetched directly from GitHub: https://github.com/sol-prog/Mandelbrot_Set

FreeImage must be installed to compile the program.

=== Compilation Instructions: ===

For Unix-based systems:

g++ -std=c++11 save_image.cpp utils.cpp mandel.cpp -lfreeimage

OS X:

clang++ -std=c++11 save_image.cpp utils.cpp mandel.cpp -lfreeimage

Running the compiled binary (a.out by default, since neither command passes -o) generates the Mandelbrot set, saves the images, and displays the time taken.

=== Observations ===

The program takes a significant amount of time to run because all calculations are done on the CPU. It contains nested loops that can be parallelized to make it faster.

The image size and the iteration count are hard-coded. Both can be increased to lengthen the computation considerably, which makes the workload demanding enough for GPU benchmarking, or the process can be run in a loop for stability testing. The code is relatively straightforward, so the parallelization should be easy to implement and test.

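The idea of replacing the hard-coded values and running the computation in a loop can be sketched roughly as follows. This is only an illustration: RenderConfig, its default values, and average_seconds() are hypothetical names and numbers chosen for this example and do not come from the Mandelbrot_Set repository.

```cpp
#include <chrono>

// Hypothetical settings bundle standing in for the hard-coded image size
// and iteration count. The defaults are arbitrary example values.
struct RenderConfig {
    int width = 1200;
    int height = 1200;
    int iter_max = 500;
};

// Runs `work` `repeats` times and returns the average wall-clock time of
// one run, in seconds. `repeats` must be greater than zero.
template <typename Work>
double average_seconds(Work&& work, int repeats) {
    using clock = std::chrono::steady_clock;  // monotonic, suited to timing
    const auto start = clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count() / repeats;
}
```

A call such as average_seconds([&] { render(cfg); }, 10) would then time ten runs, where render() is a placeholder for whatever function produces the image. Raising width, height, or iter_max in the configuration lengthens each run, and raising repeats turns the same harness into a simple stress loop.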
=== Hotspot ===

The hotspot is the fractal() function. It calls get_iterations(), which contains two nested for loops and a call to escape(), which in turn contains a while loop. Profiling with Instruments on OS X showed that fractal() accounts for most of the runtime, so this is the function that will be parallelized using CUDA. Once it is parallelized, the iteration count and image size can be increased to make the computation stressful enough to benchmark a GPU, or the computation can be looped to stress-test GPUs.

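The loop structure described above (two nested for loops over the pixels, each calling a routine with a while loop) is the standard escape-time pattern. The following is a minimal sketch of that pattern, not the repository's actual code: escape_count() and iteration_grid() are hypothetical stand-ins for escape() and get_iterations(), and the mapping of pixels to the region [-2, 1) x [-1.5, 1.5) is an assumption made for this example.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for escape(): iterate z = z*z + c until |z| > 2
// (norm(z) > 4) or the iteration cap is reached, and return the count.
int escape_count(std::complex<double> c, int iter_max) {
    std::complex<double> z(0.0, 0.0);
    int iter = 0;
    while (std::norm(z) <= 4.0 && iter < iter_max) {
        z = z * z + c;
        ++iter;
    }
    return iter;
}

// Hypothetical stand-in for the nested loops: one escape_count() call per
// pixel, with results stored in row-major order.
std::vector<int> iteration_grid(int width, int height, int iter_max) {
    std::vector<int> counts(static_cast<std::size_t>(width) * height);
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            // Assumed view: real axis [-2, 1), imaginary axis [-1.5, 1.5).
            const double re = -2.0 + 3.0 * col / width;
            const double im = -1.5 + 3.0 * row / height;
            counts[static_cast<std::size_t>(row) * width + col] =
                escape_count(std::complex<double>(re, im), iter_max);
        }
    }
    return counts;
}
```

No pixel's result depends on any other pixel's, which is why the two outer loops are natural candidates for a CUDA kernel that assigns one thread per pixel, leaving only the while loop inside each thread.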
=== Profiling Data Screenshots ===

Profile - [https://drive.google.com/open?id=0B2Y_atB3DptbUG5oRWMyUGNQdlU Profile]

Hotspot Code - [https://drive.google.com/open?id=0B2Y_atB3DptbRlhCUTNyeEFDbEk Hotspot Code]

----

Revision as of 22:35, 23 February 2017
