GPU610/SSD

29 bytes added, 11:52, 9 February 2013
Assignment 1
Sezar:
I decided to profile a [http://home.web.cern.ch/ CERN] project, Drive_God_Lin.
After running gprof on the project with the provided test data, I learned the following:
(summary of the gprof flat profile)
==  Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total
  time   seconds   seconds    calls  ms/call  ms/call  name
 37.42     51.65    24.97   314400     0.08     0.08  zfunr_
 12.69     60.12     8.47   314400     0.03     0.03  cfft_
...== 
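As you can see, most of the program's time is spent in the ordres() and zfunr() subroutines; both are located in the Fortran portion of the program (there is a C portion as well). The excerpt above starts at the zfunr_ row: its cumulative time (51.65 s) minus its self time (24.97 s) leaves about 26.7 s for the entry listed above it, which is presumably ordres(), since gprof sorts the flat profile by decreasing self time.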
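To get a rough idea of how much a GPU port could help, the numbers in the profile can be fed into Amdahl's law. The following is only a minimal sketch in C: the function names are my own (hypothetical, not part of Drive_God_Lin), and the only inputs taken from the profile are the two listed rows (zfunr_ and cfft_).

```c
#include <assert.h>

/* Estimate the total profiled runtime from one flat-profile row:
 * self_seconds is the row's "self seconds", percent its "% time". */
double total_runtime(double self_seconds, double percent) {
    assert(percent > 0.0 && percent <= 100.0);
    return self_seconds / (percent / 100.0);
}

/* Time attributed to the rows listed above a given row:
 * gprof's "cumulative seconds" is a running sum of "self seconds". */
double seconds_above(double cumulative_seconds, double self_seconds) {
    return cumulative_seconds - self_seconds;
}

/* Amdahl's law: overall speedup when a fraction of the runtime
 * (0..1) is accelerated by the given factor (> 0). */
double amdahl_speedup(double fraction, double factor) {
    assert(fraction >= 0.0 && fraction <= 1.0);
    assert(factor > 0.0);
    return 1.0 / ((1.0 - fraction) + fraction / factor);
}
```

For example, total_runtime(24.97, 37.42) gives a total of roughly 66.7 seconds, and seconds_above(51.65, 24.97) gives the 26.68 seconds spent in whatever is listed above zfunr_. Taking only zfunr_ and cfft_ together (37.42% + 12.69% = 50.11%), amdahl_speedup(0.5011, 10.0) comes out at about 1.8, and even an unlimited speedup of those two routines alone could not do better than about 2x overall. The 10x factor is an assumed value for illustration, not a measurement.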
Furthermore, this program is already parallelized using OpenMP, which suggests that it may be parallelized further using CUDA.
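The reason an OpenMP loop is a good starting point is that its iterations are already independent of each other, which is the same property a CUDA kernel needs. Here is a small sketch in C of what I mean. It is a made-up example (a simple saxpy loop with hypothetical names), not code from Drive_God_Lin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical loop body: iteration i reads and writes only element i,
 * so iterations can run in any order or at the same time. */
static void saxpy_element(size_t i, float a, const float *x, float *y) {
    y[i] = a * x[i] + y[i];
}

/* OpenMP version: the pragma splits the iterations across CPU threads.
 * Without OpenMP enabled the pragma is ignored and the loop runs serially. */
void saxpy_openmp(size_t n, float a, const float *x, float *y) {
    assert(x != NULL && y != NULL);
    /* Signed loop index, which older OpenMP versions require. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i) {
        saxpy_element((size_t)i, a, x, y);
    }
}
```

In a CUDA port the for loop would go away: the body would become a kernel, and each GPU thread would compute its own i from blockIdx.x * blockDim.x + threadIdx.x and return early if i >= n. Whether the loops in the Fortran subroutines are this clean is something I still need to check.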
 
In these two subroutines (especially in ordres()) there are many nested loops, if statements, gotos, and even some sort of search algorithm (in ordres() only) that I'm certain could be notably improved using CUDA.