BLAStoise
Project Name Goes here
Team Members
- Matt Babol, Sudoku
- Jonathan Desmond, Some other responsibility
- Sallie Jiang, Some other responsibility
[ASCII art]
Progress
Assignment 1
Sudoku Solver by Matt B.
Sudoku solver is a program that solves a 9x9 sudoku puzzle. The user can pass in an existing file for the program to solve, or manually enter the values to be solved. The data needs to be in a specific format for the program to work: there are 9 rows of values, and each cell in a row must be separated from the next by a space. A value of 0 marks a cell that the program has to solve.
Original source code can be found here.
Easy puzzle
To compile the program, open the terminal and go to the project's directory.
$ g++ -std=c++0x -pg solver.cpp checks.cpp checksolution.cpp -o Sudoku
This will create an executable file called Sudoku. The -pg flag instruments the executable so that running it writes a gmon.out file, which allows us to profile the program with gprof.
This is the easy sudoku puzzle that will be run through the program first. The file is saved as 'puzzle' in the same directory.
0 6 0 0 0 0 9 7 2
0 5 0 0 0 2 0 0 3
0 7 0 3 9 0 5 0 0
2 0 0 0 0 5 4 0 8
0 0 0 0 0 0 0 0 0
3 0 1 8 0 0 0 0 6
0 0 4 0 2 3 0 8 0
7 0 0 9 0 0 0 2 0
9 2 5 0 0 0 0 4 0
Run the code with
$ ./Sudoku puzzle
After the program is done running, the result is
1 6 3 4 5 8 9 7 2
4 5 9 7 1 2 8 6 3
8 7 2 3 9 6 5 1 4
2 9 7 1 6 5 4 3 8
5 8 6 2 3 4 1 9 7
3 4 1 8 7 9 2 5 6
6 1 4 5 2 3 7 8 9
7 3 8 9 4 1 6 2 5
9 2 5 6 8 7 3 4 1
Test Case
To profile the program, run this command.
$ gprof -p -b ./Sudoku gmon.out > Sudoku.flt
The profiling result
Flat profile:

Each sample counts as 0.01 seconds.
 no time accumulated

  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
  0.00      0.00     0.00     4539     0.00     0.00  checkRow(int, int)
  0.00      0.00     0.00     1620     0.00     0.00  checkColumn(int, int)
  0.00      0.00     0.00     1120     0.00     0.00  placeNum(int, int)
  0.00      0.00     0.00      698     0.00     0.00  checkSquare(int, int, int)
  0.00      0.00     0.00      476     0.00     0.00  goBack(int&, int&)
  0.00      0.00     0.00        2     0.00     0.00  print(int (*) [9])
  0.00      0.00     0.00        1     0.00     0.00  _GLOBAL__sub_I_sudoku
  0.00      0.00     0.00        1     0.00     0.00  _GLOBAL__sub_I_temp
  0.00      0.00     0.00        1     0.00     0.00  solveSudoku()
  0.00      0.00     0.00        1     0.00     0.00  storePositions()
  0.00      0.00     0.00        1     0.00     0.00  __static_initialization_and_destruction_0(int, int)
  0.00      0.00     0.00        1     0.00     0.00  __static_initialization_and_destruction_0(int, int)
Hard puzzle
For the hard puzzle, below is the input file as well as the result
Input:
0 0 0 0 0 0 0 0 0
0 0 0 0 0 3 0 8 5
0 0 1 0 2 0 0 0 0
0 0 0 5 0 7 0 0 0
0 0 4 0 0 0 1 0 0
0 9 0 0 0 0 0 0 0
5 0 0 0 0 0 0 7 3
0 0 2 0 1 0 0 0 0
0 0 0 0 4 0 0 0 9

Result:
9 8 7 6 5 4 3 2 1
2 4 6 1 7 3 9 8 5
3 5 1 9 2 8 7 4 6
1 2 8 5 3 7 6 9 4
6 3 4 8 9 2 1 5 7
7 9 5 4 6 1 8 3 2
5 1 9 2 8 6 4 7 3
4 7 2 3 1 9 5 6 8
8 6 3 7 4 5 2 1 9
The profiling results are
Flat profile:

Each sample counts as 0.01 seconds.

  %   cumulative   self               self     total
 time   seconds   seconds     calls   s/call   s/call  name
 47.27      7.83     7.83 622577597     0.00     0.00  checkRow(int, int)
 18.63     10.92     3.09 223365661     0.00     0.00  checkColumn(int, int)
 14.18     13.27     2.35 157353814     0.00     0.00  placeNum(int, int)
 11.49     15.17     1.90 100608583     0.00     0.00  checkSquare(int, int, int)
  4.29     15.89     0.71  69175252     0.00     0.00  goBack(int&, int&)
  3.93     16.54     0.65         1     0.65    16.54  solveSudoku()
  0.21     16.57     0.04         1     0.04     0.04  _GLOBAL__sub_I_sudoku
  0.00     16.57     0.00         2     0.00     0.00  print(int (*) [9])
  0.00     16.57     0.00         1     0.00     0.00  _GLOBAL__sub_I_temp
  0.00     16.57     0.00         1     0.00     0.00  storePositions()
  0.00     16.57     0.00         1     0.00     0.00  __static_initialization_and_destruction_0(int, int)
  0.00     16.57     0.00         1     0.00     0.00  __static_initialization_and_destruction_0(int, int)
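The function names in the hard-puzzle profile (checkRow, checkColumn, checkSquare, placeNum, goBack) are consistent with a brute-force backtracking search, and the call counts suggest that the row check runs first and the other checks are skipped when it fails. The original source is not reproduced on this page, so the following is only an independent sketch of that general technique under those assumptions; Grid, rowAllows, columnAllows, boxAllows and solve are hypothetical names, not the original functions.

```cpp
#include <array>

// Hypothetical grid type: 0 marks an unsolved cell.
using Grid = std::array<std::array<int, 9>, 9>;

// True if value does not yet appear in the given row.
bool rowAllows(const Grid& g, int row, int value) {
    for (int col = 0; col < 9; ++col)
        if (g[row][col] == value) return false;
    return true;
}

// True if value does not yet appear in the given column.
bool columnAllows(const Grid& g, int col, int value) {
    for (int row = 0; row < 9; ++row)
        if (g[row][col] == value) return false;
    return true;
}

// True if value does not yet appear in the 3x3 box containing (row, col).
bool boxAllows(const Grid& g, int row, int col, int value) {
    const int r0 = row - row % 3;
    const int c0 = col - col % 3;
    for (int r = r0; r < r0 + 3; ++r)
        for (int c = c0; c < c0 + 3; ++c)
            if (g[r][c] == value) return false;
    return true;
}

// Fills empty cells in row-major order, trying 1..9 in ascending order
// and undoing a placement (backtracking) when it leads to a dead end.
bool solve(Grid& g, int cell = 0) {
    if (cell == 81) return true;                  // every cell is filled
    const int row = cell / 9;
    const int col = cell % 9;
    if (g[row][col] != 0) return solve(g, cell + 1);  // given clue, skip
    for (int value = 1; value <= 9; ++value) {
        // && short-circuits, so the row check is called most often.
        if (rowAllows(g, row, value) && columnAllows(g, col, value) &&
            boxAllows(g, row, col, value)) {
            g[row][col] = value;
            if (solve(g, cell + 1)) return true;
            g[row][col] = 0;                      // undo and try the next value
        }
    }
    return false;                                 // no value fits this cell
}
```

A search of this kind tries candidates in ascending order, which may explain the gap between the two profiles: the hard puzzle's solution begins with the row 9 8 7 6 5 4 3 2 1, so nearly every smaller candidate in the first row has to be explored and rejected before the correct one is reached.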
Oil Painting By Sallie J.
This program uses OpenCV to convert a regular image into a stylized oil painting. The painting algorithm depends on the brush size and colour intensity. The program takes three command line arguments: an int for brush size, an int for intensity, and the file name of an image. Upon finishing, the program produces the original image along with the oil paint version and the total time required in seconds.
The original source code can be found here. However, some changes have been made to make testing and profiling slightly easier: mainly, the for-loop logic was moved into a function outside of main, and the hard-coded values were replaced with command line arguments.
To profile the program, run this command.
$
The time required for the program depends largely on the size of the file being converted: around 5 seconds for a 50KB image and 100 seconds for a 1MB image. It also depends on the brush size and intensity levels.
Analysis
The profiling revealed that 99% of the processing time is spent in the paint function, where the for-loop logic is located. Within that 99%, the program spends roughly two thirds of its time accessing data through the "at" function of the OpenCV Mat class (n-dimensional dense array class). The other third is spent on direct access through OpenCV’s Vec class (short numerical vectors). The for-loop divides the picture into sections based on brush size. It then finds the colour of each pixel in a section and averages the intensity to produce the final colour of that group of pixels. This is what makes the program ideal for parallelizing, because each iteration of the for-loop calculates the final colour of a pixel. (It is a SIMD type of process: the single instruction is to find the final colour and the multiple data are the pixels.)
//Simplified for-loop structure
for (int y = BrushSize; y < (height - BrushSize); y++) //for each row based on brush size
{
    for (int x = BrushSize; x < (width - BrushSize); x++) //for each column in brush size
    {
        for (int j = -BrushSize; j <= BrushSize; j++) //for each pixel row in one brush size grouping
        {
            for (int i = -BrushSize; i <= BrushSize; i++) //for each pixel column in one brush size grouping
            {
                //algorithm and logic for colour calculations
            }
        }
    }
}
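To make the loop structure concrete without depending on OpenCV, here is a minimal sketch of one common formulation of the oil-paint filter: the neighbours of a pixel are grouped into intensity levels, and the output colour is the average colour of the most populated level. This formulation is an assumption, since the colour calculation itself is elided above. Pixel, paintPixel and oilPaint are hypothetical names, and a plain std::vector stands in for the cv::Mat the original program uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel type standing in for OpenCV's Vec3b.
struct Pixel {
    std::uint8_t r, g, b;
};

// Computes the output colour of pixel (x, y) from its neighbourhood.
// src is a row-major image; the caller keeps the window inside the image.
Pixel paintPixel(const std::vector<Pixel>& src, int width, int x, int y,
                 int brushSize, int levels) {
    std::vector<int> count(levels + 1, 0);
    std::vector<int> sumR(levels + 1, 0), sumG(levels + 1, 0), sumB(levels + 1, 0);
    for (int j = -brushSize; j <= brushSize; ++j) {
        for (int i = -brushSize; i <= brushSize; ++i) {
            const Pixel& p = src[static_cast<std::size_t>(y + j) * width + (x + i)];
            // Map the average channel value (0..255) to a level (0..levels).
            const int level = ((p.r + p.g + p.b) / 3) * levels / 255;
            ++count[level];
            sumR[level] += p.r;
            sumG[level] += p.g;
            sumB[level] += p.b;
        }
    }
    int best = 0;  // most populated intensity level
    for (int level = 1; level <= levels; ++level)
        if (count[level] > count[best]) best = level;
    return Pixel{static_cast<std::uint8_t>(sumR[best] / count[best]),
                 static_cast<std::uint8_t>(sumG[best] / count[best]),
                 static_cast<std::uint8_t>(sumB[best] / count[best])};
}

// Applies paintPixel to every pixel whose window fits inside the image.
// Border pixels are copied unchanged in this sketch.
std::vector<Pixel> oilPaint(const std::vector<Pixel>& src, int width, int height,
                            int brushSize, int levels) {
    std::vector<Pixel> dst = src;
    for (int y = brushSize; y < height - brushSize; ++y)
        for (int x = brushSize; x < width - brushSize; ++x)
            dst[static_cast<std::size_t>(y) * width + x] =
                paintPixel(src, width, x, y, brushSize, levels);
    return dst;
}
```

In this sketch each call to paintPixel only reads from src and writes a single element of dst, so no iteration of the two outer loops depends on another. That independence is what would allow the body of paintPixel to become the per-thread work in a GPU version, with one thread per output pixel.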