Open main menu

CDOT Wiki β

Changes

GPU610/Cosmosis

3,138 bytes added, 17:48, 8 February 2013
m
no edit summary
File:Dps915_neil_calls_calc.png|Self Calls
</gallery>
 
====Jesse's Profiling Findings====
Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4
 
Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4
 
I found an iterated approximation of an N-Body project on GitHub at: https://github.com/jrupac/N-body-Simulation. The simulation uses the brute force algorithm with a run-time of O(n^2). It uses SDL to draw squares which represent each body. I have removed all SDL so it only does the calculations. It also has collision detection with other bodies. It still collides with the window sides even though I have removed the window. I also included a high resolution timer by Song Ho Ahn. The timer class can be found at: http://www.cnblogs.com/JohnShao/archive/2011/10/14/2211085.html
 
'''Testing Environment:'''
 
* Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 Windows: i5 3570K @ 5Ghz
 
*Raw computations, no graphical visualization
 
*The velocity of the point is modified. Not a proper force.
 
*Brute force algoritim for calculation the forces O(N^2)
 
====Profile: 512 Samples of 500-1500 Bodies====
Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4
 
I ran a series of N-Body simulations and sampled each run 512 times. The results are exactly as you would expect for an O(n^2) algorithm; a quadratic increase in time.
 
[[File:gpu610_jsantos13_samples.png|border]]
 
====Analysis:====
Using Visual Studio 10 profiler, it is clear that the update function is an expensive call path.
 
[[File:gpu610_jsantos13_vsprofile.png|border]]
 
Although the program is an iterated approximation of an N-Body simulation, it is slower than a more proper N-Body simulation. The calculations are incorrect and the majority of them are unnecessary. The update function uses a double for loop to calculate the forces for each particle amongst one another. This is of O(n^2) runtime.
 
[[File:gpu610_jsantos13_code.png|border]]
 
====Summary:====
You can see that the majority of the processing time is used on SQ(square) and MAX(which value is bigger) calculations. The point calculation can be done independently and therefore can be parallelized with CUDA. This program can be speed up even more if we utilize the Barnes-hut algorithm for calculating N-Bodies using quad trees and spatial partitions.
=== Assignment 2 ===
=== Assignment 3 ===
1
edit