− | {{GPU610/DPS915 Index | 20181}}
| |
− | = Team 7 =
| |
− | == Team Members ==
| |
− | # [mailto:aminassian@myseneca.ca?subject=dps915 Alek Minassian]
| |
− | # [mailto:achowdhury17@myseneca.ca?subject=dps915 Ariquddowla Chowdhury]
| |
− | # [mailto:ayeung24@myseneca.ca?subject=dps915 Alfred Yeung]
| |
− | [mailto:aminassian@myseneca.ca;achowdhury17@myseneca.ca;ayeung24@myseneca.ca Email All]
| |
| | | |
− | == Assignment 1 ==
| |
− | === Creating ray-traced images by Alek Minassian ===
| |
− | This is an open-source program available from [http://cosinekitty.com/raytrace/chapter05_cpp_code.html here]. It can be built and run on both Windows and Linux. A copy of the source code in ZIP format, along with instructions for building and running it, is available in the sections below. The program works by creating one or more objects and placing them in a "Scene". It then adds one or more light sources and traces rays of light as they reflect off or refract through the objects. As an example, the chessboard image below is generated as follows:
| |
− | *A chessboard is created, rotated, and added to the scene.
| |
− | *Three spheres are created and added to the scene.
| |
− | *Three light sources are added.
| |
− | *Finally, the new image is saved.
| |
− |
| |
− | [[File:Chessboard.png]]
| |
− |
| |
− | ==== Downloading the source code ====
| |
− | A modified version of the project is available from [[Media:Raytrace.a1.zip | here]]. The following modifications were made:
| |
− | *The raytrace/raytrace/build file used for building on Linux was modified to enable profiling and to use version 7.2.0 of g++ on Matrix.
| |
− | *The Visual Studio solution raytrace/raytrace.sln is upgraded to Visual Studio 2017.
| |
− | *Some code was added to scene.cpp to report the time spent in certain parts of the code. This gives more granular timing than is available through profiling alone.
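The timing code added to scene.cpp is not reproduced on this page. The following is only a minimal sketch of how such a measurement could be taken with std::chrono. The helper name timeMilliseconds and the work passed to it are hypothetical and are not taken from the project.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>

// Hypothetical helper (not from scene.cpp): runs `work` once and returns the
// elapsed wall-clock time in whole milliseconds.
std::int64_t timeMilliseconds(const std::function<void()>& work)
{
    using Clock = std::chrono::steady_clock;  // monotonic, suited to measuring intervals
    const Clock::time_point start = Clock::now();
    work();
    const Clock::time_point end = Clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}
```

In the application, the callable would wrap the section of interest, and the returned value would be printed next to a label for that section.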
| |
− |
| |
− | ==== Building and running on Windows ====
| |
− | *Unzip the source from the previous step and open raytrace/raytrace.sln.
| |
− | *Set the active configuration and platform:
| |
− | **From the "Build" menu item, select "Configuration Manager..."
| |
− | **Change the Active solution platform to be x64.
| |
− | **Change the Active solution configuration to be Release.
| |
− | *If you run into build issues, you may have to re-target the solution as follows:
| |
− | **In the Solution Explorer pane, right-click on the solution and select "Retarget solution".
| |
− | **Select the latest Windows SDK version available to you.
| |
− | *You can now build the solution from the build menu.
| |
− | *Run the program by selecting "Start Without Debugging" from the Debug menu. You should now see the following screen:
| |
− | [[File:AM ReportTime.png|500px]]
| |
− | *The chessboard image chessboard.png, shown above, is now generated and can be found in the raytrace/raytrace folder.
| |
− |
| |
− | ==== Building and running on Linux (Matrix) ====
| |
− | *Download the source: ''curl <nowiki>https://wiki.cdot.senecacollege.ca/w/imgs/Raytrace.a1.zip</nowiki> -o rtsource.zip''
| |
− | *Unzip the source: ''unzip rtsource.zip''
| |
− | *''cd raytrace/raytrace''
| |
− | *''chmod a+x build''
| |
− | *Build the source code: ''./build''
| |
− | *Run the application: ''./raytrace chessboard''
| |
− |
| |
− | ==== Profiling and Analysis ====
| |
− | The following shows the results of profiling in Visual Studio:
| |
− |
| |
− | [[File:AM_Profile.PNG]]
| |
− |
| |
− |
| |
− | '''Profiling on Matrix'''
| |
− |
| |
− | The results of the profiling done on Matrix can be found in this ''[[Media:AM raytrace.flt.txt | flat profile]]'' and this ''[[Media:AM raytrace.clg.txt | call graph]]''.
| |
− |
| |
− |
| |
− | The following shows some timings reported by code that was added to the application.
| |
− |
| |
− | [[File:AM_ReportTime.png|500px]]
| |
− |
| |
− | Analysis of the code shows that ''main'' calls ''ChessBoardTest'' which calls ''SaveImage''. Based on the profiling results above, the application spends the majority of its time in ''SaveImage''. Analyzing the code also shows that ''SaveImage'' calls ''TraceRay'' in a nested loop as follows:
| |
− | <code>
| |
− | <nowiki>
| |
− | for (size_t i=0; i < largePixelsWide; ++i)
| |
− | {
| |
− | ......
| |
− |
| |
− | for (size_t j=0; j < largePixelsHigh; ++j)
| |
− | {
| |
− | ......
| |
− | TraceRay();
| |
− | ......
| |
− | }
| |
− | }
| |
− | </nowiki>
| |
− | </code>
| |
− |
| |
− | The remaining functions in which most of the time is spent are called from ''TraceRay''. The timing statements added to the code show that 3261 milliseconds are spent in this nested loop, out of a total of 3658 milliseconds for the whole application. The nested loop therefore accounts for about 89% of the run time. Since one iteration of the loop does not depend on another, the calls to ''TraceRay'' can be parallelized.
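As an illustration of why independent iterations make the loop easy to parallelize, here is a minimal CPU-side sketch using an OpenMP pragma. It is not the project's code: tracePixel is a hypothetical stand-in for ''TraceRay'', and renderImage and the flat image buffer are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for TraceRay: the real function traces a ray through
// the scene; this one just returns a deterministic value per pixel.
double tracePixel(std::size_t i, std::size_t j)
{
    return static_cast<double>(i * 31 + (j * 17) % 255);
}

// Fills a pixelsWide x pixelsHigh image stored column by column.
std::vector<double> renderImage(std::size_t pixelsWide, std::size_t pixelsHigh)
{
    std::vector<double> image(pixelsWide * pixelsHigh);

    // Each (i, j) pair writes only its own element, so no locking is needed.
    // When compiled without OpenMP support (e.g. without -fopenmp on g++),
    // the pragma is ignored and the loop simply runs serially.
    #pragma omp parallel for
    for (std::size_t i = 0; i < pixelsWide; ++i)
    {
        for (std::size_t j = 0; j < pixelsHigh; ++j)
        {
            image[i * pixelsHigh + j] = tracePixel(i, j);
        }
    }
    return image;
}
```

The same independence property is what would allow each pixel to be assigned to its own GPU thread, although that mapping is not shown here.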
| |
− |
| |
− | '''Sudoku Solver'''
| |
− |
| |
− |
| |
− | This is an open-source project that I found on GitHub, available [https://github.com/bryanesmith/Sudoku-solver here]. The program can be compiled with the GNU C++ compiler.
| |
− |
| |
− | The program first defines the initial Sudoku board. It then sets each empty cell in turn, checking that the candidate value fits according to the Sudoku rules. Every time a value is set, the solver verifies that the rules still hold across the board and backtracks if they do not.
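The general backtracking approach described above can be sketched as follows. This is a simplified illustration and not the linked project's implementation: the Board type and the functions fits and solve are hypothetical names chosen for the example.

```cpp
#include <array>

using Board = std::array<std::array<int, 9>, 9>;  // 0 marks an empty cell

// True if `value` can be placed in the empty cell (row, col) without
// repeating a value in its row, column, or 3x3 box.
bool fits(const Board& b, int row, int col, int value)
{
    for (int k = 0; k < 9; ++k)
    {
        if (b[row][k] == value || b[k][col] == value) return false;
    }
    const int boxRow = row / 3 * 3;
    const int boxCol = col / 3 * 3;
    for (int r = boxRow; r < boxRow + 3; ++r)
    {
        for (int c = boxCol; c < boxCol + 3; ++c)
        {
            if (b[r][c] == value) return false;
        }
    }
    return true;
}

// Fills empty cells in row-major order, starting at cell index `cell` (0..80).
// Returns true once every cell is filled; otherwise undoes the last choice.
bool solve(Board& b, int cell = 0)
{
    if (cell == 81) return true;                      // all cells filled
    const int row = cell / 9;
    const int col = cell % 9;
    if (b[row][col] != 0) return solve(b, cell + 1);  // given clue, skip it
    for (int value = 1; value <= 9; ++value)
    {
        if (fits(b, row, col, value))
        {
            b[row][col] = value;
            if (solve(b, cell + 1)) return true;
            b[row][col] = 0;                          // backtrack
        }
    }
    return false;                                     // no value fits here
}
```

Each recursive call depends on the cells filled by the calls before it, which is the data dependency that makes this style of search hard to parallelize.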
| |
− |
| |
− | The code that seemingly runs the longest is in the verifyValue function.
| |
− |
| |
− | <code>
| |
− | for (int y_verify=box_y * 3; y_verify < box_y * 3 + 3; y_verify++) {
| |
− | // For each x in the same box
| |
− | for (int x_verify=box_x * 3; x_verify < box_x * 3 + 3; x_verify++) {
| |
− | // Skip self.
| |
− | if (x_verify == x_cord && y_verify == y_cord) {
| |
− | continue;
| |
− | }
| |
− |
| |
− | // If same value, failed
| |
− | int verifyValue = board[x_verify][y_verify];
| |
− | if (verifyValue == value) {
| |
− | return false;
| |
− | }
| |
− | }
| |
− | }
| |
− |
| |
− | </code>
| |
− |
| |
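− | This part runs in O(xy) time, where x and y are the box dimensions. Because each box is a fixed 3x3 block, the loop performs at most eight comparisons per call, so its cost is constant for a standard 9x9 board.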
| |
− |
| |
− | '''Profiling and Call Graph'''
| |
− |
| |
− | Profiling shows that the initial solution is already fast: gprof accumulated no measurable time, even across 44317 calls to verifyValue.
| |
− |
| |
− | '''Flat profile:'''
| |
− | Each sample counts as 0.01 seconds.
| |
− | no time accumulated
| |
− |
| |
− | % cumulative self self total
| |
− | time seconds seconds calls Ts/call Ts/call name
| |
− | 0.00 0.00 0.00 44317 0.00 0.00 SudokuPuzzle::verifyValue(int, int)
| |
− | 0.00 0.00 0.00 1 0.00 0.00 _GLOBAL__sub_I__ZN12SudokuPuzzleC2Ev
| |
− | 0.00 0.00 0.00 1 0.00 0.00 _GLOBAL__sub_I_main
| |
− | 0.00 0.00 0.00 1 0.00 0.00 SudokuPuzzle::solve(int, int)
| |
− |
| |
− |
| |
− |
| |
− |
| |
− | '''Call graph'''
| |
− |
| |
− | granularity: each sample hit covers 4 byte(s) no time propagated
| |
− |
| |
− | index % time self children called name
| |
− | 0.00 0.00 44317/44317 SudokuPuzzle::solve(int, int) [10]
| |
− | [7] 0.0 0.00 0.00 44317 SudokuPuzzle::verifyValue(int, int) [7]
| |
− | -----------------------------------------------
| |
− | 0.00 0.00 1/1 __do_global_ctors_aux [18]
| |
− | [8] 0.0 0.00 0.00 1 _GLOBAL__sub_I__ZN12SudokuPuzzleC2Ev [8]
| |
− | -----------------------------------------------
| |
− | 0.00 0.00 1/1 __do_global_ctors_aux [18]
| |
− | [9] 0.0 0.00 0.00 1 _GLOBAL__sub_I_main [9]
| |
− | -----------------------------------------------
| |
− | 4701 SudokuPuzzle::solve(int, int) [10]
| |
− | 0.00 0.00 1/1 SudokuPuzzle::solve() [15]
| |
− | [10] 0.0 0.00 0.00 1+4701 SudokuPuzzle::solve(int, int) [10]
| |
− | 0.00 0.00 44317/44317 SudokuPuzzle::verifyValue(int, int) [7]
| |
− | 4701 SudokuPuzzle::solve(int, int) [10]
| |
− | -----------------------------------------------
| |
− |
| |
− |
| |
− | Index by function name
| |
− |
| |
− | [8] _GLOBAL__sub_I__ZN12SudokuPuzzleC2Ev (SudokuPuzzle.cpp) [7] SudokuPuzzle::verifyValue(int, int)
| |
− | [9] _GLOBAL__sub_I_main (main.cpp) [10] SudokuPuzzle::solve(int, int)
| |
− |
| |
− |
| |
− | Assessment:
| |
− |
| |
− |
| |
− | This code would not benefit from parallelism: it is already fast, and each result relies on a previous result. Parallelizing it would make the code considerably more complex for little gain. A parallel version might only pay off if the Sudoku board were larger than 9x9.
| |
− |
| |
− | == Assignment 2 ==
| |
− | == Assignment 3 ==
| |