Changes

Jump to: navigation, search

GPU610/DPS915 Team 7 Project Page

12,709 bytes removed, 19:00, 23 April 2018
Blanked the page
{{GPU610/DPS915 Index | 20181}}
= Team 7 =
== Team Members ==
# [mailto:aminassian@myseneca.ca?subject=dps915 Alek Minassian]
# [mailto:achowdhury17@myseneca.ca?subject=dps915 Ariquddowla Chowdhury]
# [mailto:ayeung24@myseneca.ca?subject=dps915 Alfred Yeung]
[mailto:aminassian@myseneca.ca;achowdhury17@myseneca.ca;ayeung24@myseneca.ca Email All]
== Assignment 1 ==
=== Creating ray-traced images by Alek Minassian ===
This is an open-source program available from [http://cosinekitty.com/raytrace/chapter05_cpp_code.html here]. This program can be built and run on both Windows and Linux. A copy of the source code in ZIP format is available in the section below as well as instructions for building and running. The program works by creating one or more objects and placing them on a "Scene". It then adds one or more light sources and traces them as they reflect off or refract through objects. As an example, the following chessboard is generated as follows:
*A chessboard is created, rotated, and added to the scene.
*Three spheres are created and added to the scene.
*Three light sources are added.
*Finally, the new image is saved.
 
[[File:Chessboard.png]]
 
==== Downloading the source code ====
A modified version of the project is available from [[Media:Raytrace.a1.zip | here]]. The following modifications were made:
*The raytrace/raytrace/build file used for building on Linux is modified to enable profiling as well as using version 7.2.0 of g++ on Matrix.
*The Visual Studio solution raytrace/raytrace.sln is upgraded to Visual Studio 2017.
*Some code has been added to scene.cpp in order to report the time spent in certain parts of the code. This is so that we can get a more granular timing than what is available through profiling.
 
==== Building and running on Windows ====
*Unzip the source from the previous step and open raytrace/raytrace.sln.
*Set the active configuration and platform:
**From the "Build" menu item, select "Configuration Manager..."
**Change the Active solution platform to be x64.
**Change the Active solution configuration to be Release.
*If you run into build issues, you may have to re-target the solution as follows:
**In the Solution Explorer pane, right-click on the solution and select "Retarget solution".
**Select the latest Windows SDK version available to you.
*You can now build the solution from the build menu.
*Run the program by selecting "Start Without Debugging" from the Debug menu. You should now see the following screen:
[[File:AM ReportTime.png|500px]]
*The chessboard image chessboard.png, that was displayed above, is now generated and can be found in the raytrace/raytrace folder.
 
==== Building and running on Linux (Matrix) ====
*Download the source: ''curl <nowiki>https://wiki.cdot.senecacollege.ca/w/imgs/Raytrace.a1.zip</nowiki> -o rtsource.zip''
*Unzip the source: ''unzip rtsource.zip''
*''cd raytrace/raytrace''
*''chmod a+x build''
*Build the source code: ''./build''
*Run the application: ''./raytrace chessboard''
 
==== Profiling and Analysis ====
The following shows the results of profiling in Visual Studio:
 
[[File:AM_Profile.PNG]]
 
 
'''Profiling on Matrix'''
 
The results of the profiling done on Matrix can be found in this ''[[Media:AM raytrace.flt.txt | flat profile]]'' and this ''[[Media:AM raytrace.clg.txt | call graph]]''.
 
 
The following shows some timings reported by code that was added to the application.
 
[[File:AM_ReportTime.png|500px]]
 
Analysis of the code shows that ''main'' calls ''ChessBoardTest'' which calls ''SaveImage''. Based on the profiling results above, the application spends the majority of its time in ''SaveImage''. Analyzing the code also shows that ''SaveImage'' calls ''TraceRay'' in a nested loop as follows:
<code>
<nowiki>
for (size_t i=0; i < largePixelsWide; ++i)
{
......
 
for (size_t j=0; j < largePixelsHigh; ++j)
{
......
TraceRay();
......
}
}
</nowiki>
</code>
 
The remainder of the functions where the majority of the time is spent are called from ''TraceRay''. The timing statements added to the code show that 3261 milliseconds are spent in this nested loop. The total time spent in the application is 3658 milliseconds. Therefore, we can conclude that the majority of the time is spent in the above nested loop. Since one iteration of the loop does not depend on another iteration, the calls to ''TraceRay'' can be parallelized.
 
----
 
===Sudoku Solver by Ariquddowla Chowdhury===
 
 
This is an open source project that I found on someone's github page which can be found [https://github.com/bryanesmith/Sudoku-solver here]. This program can be compiled with the GNU C++ compiler.
 
The program works by first defining what the sudoku board looks like. It sets each value. It checks a value and makes sure it fits based on Sudoku rules. Everytime a value is set, we backtrack to ensure that the rules are kept across the board.
 
The main chunk of code that seemily would run the longest would be in the verifyValue function.
 
<code>
<nowiki>
for (int y_verify=box_y * 3; y_verify < box_y * 3 + 3; y_verify++) {
// For each x in the same box
for (int x_verify=box_x * 3; x_verify < box_x * 3 + 3; x_verify++) {
// Skip self.
if (x_verify == x_cord && y_verify == y_cord) {
continue;
}
// If same value, failed
int verifyValue = board[x_verify][y_verify];
if (verifyValue == value) {
return false;
}
}
}
</nowiki>
</code>
 
This part runs at O(xy) time complexity.
 
====Profiling and Call Graph====
 
After further analysis, the initial solution is already fast.
 
'''Flat profile:'''
Each sample counts as 0.01 seconds.
no time accumulated
 
% cumulative self self total
time seconds seconds calls Ts/call Ts/call name
0.00 0.00 0.00 44317 0.00 0.00 SudokuPuzzle::verifyValue(int, int)
0.00 0.00 0.00 1 0.00 0.00 _GLOBAL__sub_I__ZN12SudokuPuzzleC2Ev
0.00 0.00 0.00 1 0.00 0.00 _GLOBAL__sub_I_main
0.00 0.00 0.00 1 0.00 0.00 SudokuPuzzle::solve(int, int)
 
 
 
 
'''Call graph'''
 
granularity: each sample hit covers 4 byte(s) no time propagated
 
index % time self children called name
0.00 0.00 44317/44317 SudokuPuzzle::solve(int, int) [10]
[7] 0.0 0.00 0.00 44317 SudokuPuzzle::verifyValue(int, int) [7]
-----------------------------------------------
0.00 0.00 1/1 __do_global_ctors_aux [18]
[8] 0.0 0.00 0.00 1 _GLOBAL__sub_I__ZN12SudokuPuzzleC2Ev [8]
-----------------------------------------------
0.00 0.00 1/1 __do_global_ctors_aux [18]
[9] 0.0 0.00 0.00 1 _GLOBAL__sub_I_main [9]
-----------------------------------------------
4701 SudokuPuzzle::solve(int, int) [10]
0.00 0.00 1/1 SudokuPuzzle::solve() [15]
[10] 0.0 0.00 0.00 1+4701 SudokuPuzzle::solve(int, int) [10]
0.00 0.00 44317/44317 SudokuPuzzle::verifyValue(int, int) [7]
4701 SudokuPuzzle::solve(int, int) [10]
-----------------------------------------------
 
 
Index by function name
 
[8] _GLOBAL__sub_I__ZN12SudokuPuzzleC2Ev (SudokuPuzzle.cpp) [7] SudokuPuzzle::verifyValue(int, int)
[9] _GLOBAL__sub_I_main (main.cpp) [10] SudokuPuzzle::solve(int, int)
 
 
====Assessment====
 
 
This code would not benefit from parallelism as it is already fast, and each result relies on a previous result. This would make the code incredibly complex to parallelize and
it would not benefit as such. Perhaps if the Sudoku board was larger than 9x9 the solution could be faster.
 
----
 
=== Image Processing by Alfred Yeung ===
 
I found a sample of image processing code located here: http://www.dreamincode.net/forums/topic/76816-image-processing-tutorial/.
 
The code uses PGM files (P5 type is the standard for the code). For more information about PGM files, here is a link: http://netpbm.sourceforge.net/doc/pgm.html.
 
There were a few functions that I felt could be parallelized a significant amount. These functions were enlargeImage, reflectImage, and rotateImage.
==== Functions ====
===== Enlarge Image =====
'''
int rows, cols, gray;
int pixel;
int enlargeRow, enlargeCol;
rows = oldImage.N * value;
cols = oldImage.M * value;
gray = oldImage.Q;
Image tempImage(rows, cols, gray);
for(int i = 0; i < oldImage.N; i++)
{
for(int j = 0; j < oldImage.M; j++)
{
pixel = oldImage.pixelVal[i][j];
enlargeRow = i * value;
enlargeCol = j * value;
for(int c = enlargeRow; c < (enlargeRow + value); c++)
{
for(int d = enlargeCol; d < (enlargeCol + value); d++)
{
tempImage.pixelVal[c][d] = pixel;
}
}
}
}
oldImage = tempImage;
'''
===== Reflect Image =====
'''
int rows = oldImage.N;
int cols = oldImage.M;
Image tempImage(oldImage);
if(flag == true) //horizontal reflection
{
for(int i = 0; i < rows; i++)
{
for(int j = 0; j < cols; j++)
tempImage.pixelVal[rows - (i + 1)][j] = oldImage.pixelVal[i][j];
}
}
else //vertical reflection
{
for(int i = 0; i < rows; i++)
{
for(int j = 0; j < cols; j++)
tempImage.pixelVal[i][cols - (j + 1)] = oldImage.pixelVal[i][j];
}
}
oldImage = tempImage;
'''
===== Rotate Image =====
'''
int r0, c0;
int r1, c1;
int rows, cols;
rows = oldImage.N;
cols = oldImage.M;
Image tempImage(rows, cols, oldImage.Q);
float rads = (theta * 3.14159265)/180.0;
r0 = rows / 2;
c0 = cols / 2;
for(int r = 0; r < rows; r++)
{
for(int c = 0; c < cols; c++)
{
r1 = (int) (r0 + ((r - r0) * cos(rads)) - ((c - c0) * sin(rads)));
c1 = (int) (c0 + ((r - r0) * sin(rads)) + ((c - c0) * cos(rads)));
if(inBounds(r1,c1))
{
tempImage.pixelVal[r1][c1] = oldImage.pixelVal[r][c];
}
}
}
for(int i = 0; i < rows; i++)
{
for(int j = 0; j < cols; j++)
{
if(tempImage.pixelVal[i][j] == 0)
tempImage.pixelVal[i][j] = tempImage.pixelVal[i][j+1];
}
}
oldImage = tempImage;
'''
==== Profiling ====
 
I started with a PGM file that contained 512 width and height (in ASCII decimal), and 255 gray value (in ASCII decimal). This totaled to 257 KB as the file size.
 
I enlarged the image by 5 times its original size. I then reflected the image horizontally. Finally, I rotated the image 90 degrees.
 
real 0m35.904s
user 0m0.520s
sys 0m22.781s
 
===== Flat Profile =====
 
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ms/call ms/call name
29.70 0.30 0.30 Image::rotateImage(int, Image&)
25.74 0.56 0.26 3 86.67 86.67 Image::operator=(Image const&)
15.84 0.72 0.16 2 80.00 80.00 Image::Image(int, int, int)
14.85 0.87 0.15 writeImage(char*, Image&)
9.90 0.97 0.10 1 100.00 100.00 Image::Image(Image const&)
1.98 0.99 0.02 Image::enlargeImage(int, Image&)
1.98 1.01 0.02 Image::reflectImage(bool, Image&)
0.00 1.01 0.00 3 0.00 0.00 Image::~Image()
0.00 1.01 0.00 1 0.00 0.00 _GLOBAL__sub_I__ZN5ImageC2Ev
 
===== Call Graph =====
 
[[File:ImgPrcCallGraph.png]]
 
==== Assessment ====
 
Judging from the flat profile and call graph, rotateImage takes the longest to complete. EnlargeImage and reflectImage both take less time to complete than rotateImage. They also are completed at very similar times. Therefore, the best function to look into parallizing would be the rotateImage function.
 
----
 
=== Final Decision ===
 
We chose Ray Tracing as our main focus because...
 
== Assignment 2 ==
== Assignment 3 ==

Navigation menu