Changes

Team NP Complete

133 bytes added, 22:49, 21 December 2017

→‎Parallelized and Dynamic Performance

==Parallelized and Dynamic Performance==

We parallelized the code for rendering and state evolution above. We did not, however, parallelize the FFT code at that stage, because we initially thought it contained a loop-carried dependency that could not be resolved:

[[File:after_fft_analysis.png|1000px|center|After FFT analysis]]

[[File:after_fft_hist.png|1000px|center|After FFT analysis]]

[[File:after_dynamic_analysis.png|1000px|center|Dynamic Performance]]

[[File:after_dynamic_histo.png|1000px|center|Dynamic Histogram]]
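The page does not show which loop-carried dependency the team suspected in the FFT code. As a hedged illustration only, the sketch below uses a common FFT case: twiddle factors built by a running product. All function names here are hypothetical and are not taken from the project. The first version cannot be split across threads because each iteration needs the previous value. The second computes each factor from its index alone, so the iterations are independent and an OpenMP for loop can be applied.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Serial form: each twiddle factor is the previous one times a fixed step,
// so iteration k cannot start before iteration k - 1 has finished.
std::vector<cd> twiddles_recurrence(std::size_t n) {
    const double pi = std::acos(-1.0);
    const cd step = std::polar(1.0, -2.0 * pi / static_cast<double>(n));
    std::vector<cd> w(n);
    cd cur(1.0, 0.0);
    for (std::size_t k = 0; k < n; ++k) {
        w[k] = cur;
        cur *= step;  // loop-carried dependency on cur
    }
    return w;
}

// Independent form: each factor is computed directly from its index k,
// so no iteration reads a value written by another iteration.
std::vector<cd> twiddles_independent(std::size_t n) {
    const double pi = std::acos(-1.0);
    const long long count = static_cast<long long>(n);
    std::vector<cd> w(n);
    #pragma omp parallel for
    for (long long k = 0; k < count; ++k) {
        const double angle =
            -2.0 * pi * static_cast<double>(k) / static_cast<double>(n);
        w[static_cast<std::size_t>(k)] = std::polar(1.0, angle);
    }
    return w;
}
```

With GCC or Clang the pragma takes effect only when the program is compiled with `-fopenmp`. Without that flag the pragma is ignored and the loop runs serially with the same result.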


=Conclusion=

After countless hours spent programming this simulation, incorporating OpenMP into the finished program did not prove very difficult. The most challenging part was identifying the performance bottleneck, that is, where CPU idle time was occurring. Once the Fourier transform function was identified as the cause, external dependencies were factored out and an OpenMP for loop was added. Because a considerable amount of CPU idle time remained, the loop was changed to use OpenMP's dynamic schedule, which hands loop iterations to threads as they become free instead of dividing the iterations among the threads up front. After the dynamic for loop was added, the program reached a consistent efficiency and run time.
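As a minimal sketch of the dynamic for loop described above, the example below applies `schedule(dynamic)` to a loop whose iterations have uneven cost. The names `uneven_work` and `run_dynamic` are hypothetical stand-ins, not the project's functions. Under a dynamic schedule the thread count stays the same. What changes is that each thread takes the next iteration when it finishes its current one, so a thread that draws cheap iterations does not sit idle while another works through expensive ones.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-item work whose cost grows with i, standing in for a
// loop body with uneven run time across iterations.
double uneven_work(int i) {
    double acc = 0.0;
    for (int j = 0; j <= i; ++j) {
        acc += static_cast<double>(j);
    }
    return acc;
}

// Each iteration writes only its own output slot, so iterations are
// independent. schedule(dynamic) assigns iterations to threads at run time
// as the threads become free.
std::vector<double> run_dynamic(int n) {
    std::vector<double> out(static_cast<std::size_t>(n));
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; ++i) {
        out[static_cast<std::size_t>(i)] = uneven_work(i);
    }
    return out;
}
```

Dynamic scheduling adds some bookkeeping per chunk of iterations, so it tends to pay off when iteration costs differ noticeably, as the idle time reported above suggests. As with any OpenMP pragma, GCC and Clang honor it only when compiling with `-fopenmp`.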

Jali-clarke

