GPU610 Team Tsubame

180 bytes added, 13:41, 12 April 2017
== PHASE 3 ==
* '''Optimize attempt 1:'''
We tried to optimize the maze program by removing some of the if statements to reduce thread divergence. However, the attempt made the program slower than the phase 2 version and only slightly faster than the serial version.

'''New Kernels'''
// Initialize all pixels to black (hex 000)
__global__ void k_drawWalls(png_byte* rows, const short* cells, const int width, const int height, const int len, const int size) {
int i = blockIdx.x * blockDim.x + threadIdx.x;
}
// Set pixels to white according to the pattern the cell belongs to
__global__ void k_drawPaths(png_byte* rows, const short* cells, const int width, const int height, const int len) {
int i = blockIdx.x * blockDim.x + threadIdx.x;
}
[[File:SPODiagram.PNG]]

'''Analysis:''' For this attempt, the kernel executes once for each cell instead of once for each byte (as in phase 2); it no longer processes more than 1 pixel in each cell for every thread, which may be the cause of the longer processing time.
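To make the per-cell mapping in the analysis above concrete, here is a minimal sketch of a kernel that runs one thread per cell and picks pixel colours with comparisons on a bit mask instead of an else-if chain. It is not the team's kernel: the bodies of `k_drawWalls` and `k_drawPaths` are not shown on this page, so the name `k_drawCell`, the `OPEN_E`/`OPEN_S` bit flags, the `CELL` size, the host helper `renderMaze`, and the flat grayscale buffer (instead of `png_byte` rows) are all assumptions made for illustration. In a mapping like this one, neighbouring threads write to addresses `CELL` bytes apart, so global memory writes are less coalesced than with one thread per byte, which is one plausible contributor to a slowdown.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical cell encoding (not from the project): bit flags for open sides.
constexpr short OPEN_E = 1;   // passage to the east neighbour
constexpr short OPEN_S = 2;   // passage to the south neighbour
constexpr int   CELL   = 4;   // assumed pixels per cell side

// One thread per cell; each thread writes the CELL x CELL pixels of its cell.
__global__ void k_drawCell(unsigned char* img, const short* cells,
                           int cols, int rows) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= cols * rows) return;

    const short c = cells[i];
    const int cx = i % cols, cy = i / cols;
    const int pitch = cols * CELL;   // image width in pixels

    for (int py = 0; py < CELL; ++py) {
        for (int px = 0; px < CELL; ++px) {
            // The last column/row of a cell is wall unless its bit is set.
            bool inner = px < CELL - 1 && py < CELL - 1;
            bool east  = px == CELL - 1 && py < CELL - 1 && (c & OPEN_E);
            bool south = py == CELL - 1 && px < CELL - 1 && (c & OPEN_S);
            img[(cy * CELL + py) * pitch + cx * CELL + px] =
                (inner || east || south) ? 255 : 0;
        }
    }
}

// Host helper (hypothetical): renders the maze and returns the image.
// CUDA error checking is omitted to keep the sketch short.
std::vector<unsigned char> renderMaze(const std::vector<short>& cells,
                                      int cols, int rows) {
    const int n = cols * rows;
    assert(static_cast<int>(cells.size()) == n);
    const size_t bytes = static_cast<size_t>(n) * CELL * CELL;

    short* d_cells = nullptr;
    unsigned char* d_img = nullptr;
    cudaMalloc(&d_cells, n * sizeof(short));
    cudaMalloc(&d_img, bytes);
    cudaMemcpy(d_cells, cells.data(), n * sizeof(short),
               cudaMemcpyHostToDevice);

    const int threads = 128;
    k_drawCell<<<(n + threads - 1) / threads, threads>>>(d_img, d_cells,
                                                         cols, rows);

    std::vector<unsigned char> img(bytes);
    cudaMemcpy(img.data(), d_img, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_cells);
    cudaFree(d_img);
    return img;
}
```

Calling `renderMaze({OPEN_E, 0}, 2, 1)` produces an 8x4 image in which the pixel at column 3 of row 0 is white (the open east side of the first cell) and the pixel at column 7 of row 0 is black (the closed east side of the second cell).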
* '''Optimize attempt 2:'''
We decided to use shared memory in the GPU to improve the speed of the program. However, the maze image does not display correctly.
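The shared-memory kernel from attempt 2 is not shown on this page, so the following is only a rough sketch of how such a kernel could be organised, not the team's implementation. It assumes one thread per pixel, a block covering `TILE` x `TILE` cells, and the same hypothetical bit-flag cell encoding and flat grayscale buffer; the names `k_drawShared`, `s_cells`, `TILE`, `CELL`, and `renderMazeShared` are invented for the example. Each block first copies its cells into shared memory, synchronizes, and only then lets every thread colour its own pixel.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical cell encoding (not from the project): bit flags for open sides.
constexpr short OPEN_E = 1;   // passage to the east neighbour
constexpr short OPEN_S = 2;   // passage to the south neighbour
constexpr int   CELL   = 4;   // assumed pixels per cell side
constexpr int   TILE   = 4;   // cells per block side (16x16 threads per block)

// One thread per pixel; each block stages its cells in shared memory first.
__global__ void k_drawShared(unsigned char* img, const short* cells,
                             int cols, int rows) {
    __shared__ short s_cells[TILE][TILE];

    // The first TILE x TILE threads of the block each load one cell.
    if (threadIdx.x < TILE && threadIdx.y < TILE) {
        int cx = blockIdx.x * TILE + threadIdx.x;
        int cy = blockIdx.y * TILE + threadIdx.y;
        s_cells[threadIdx.y][threadIdx.x] =
            (cx < cols && cy < rows) ? cells[cy * cols + cx] : 0;
    }
    // Every thread of the block must reach the barrier, so the bounds
    // check on the pixel coordinates comes after it, not before.
    __syncthreads();

    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= cols * CELL || y >= rows * CELL) return;

    const short c = s_cells[threadIdx.y / CELL][threadIdx.x / CELL];
    const int px = threadIdx.x % CELL, py = threadIdx.y % CELL;
    bool inner = px < CELL - 1 && py < CELL - 1;
    bool east  = px == CELL - 1 && py < CELL - 1 && (c & OPEN_E);
    bool south = py == CELL - 1 && px < CELL - 1 && (c & OPEN_S);
    img[y * cols * CELL + x] = (inner || east || south) ? 255 : 0;
}

// Host helper (hypothetical). CUDA error checking is omitted for brevity.
std::vector<unsigned char> renderMazeShared(const std::vector<short>& cells,
                                            int cols, int rows) {
    const int n = cols * rows;
    assert(static_cast<int>(cells.size()) == n);
    const size_t bytes = static_cast<size_t>(n) * CELL * CELL;

    short* d_cells = nullptr;
    unsigned char* d_img = nullptr;
    cudaMalloc(&d_cells, n * sizeof(short));
    cudaMalloc(&d_img, bytes);
    cudaMemcpy(d_cells, cells.data(), n * sizeof(short),
               cudaMemcpyHostToDevice);

    dim3 block(TILE * CELL, TILE * CELL);
    dim3 grid((cols + TILE - 1) / TILE, (rows + TILE - 1) / TILE);
    k_drawShared<<<grid, block>>>(d_img, d_cells, cols, rows);

    std::vector<unsigned char> img(bytes);
    cudaMemcpy(img.data(), d_img, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_cells);
    cudaFree(d_img);
    return img;
}
```

When a shared-memory version draws the wrong image, two things worth checking in a kernel of this shape are that no thread reads `s_cells` before the barrier and that the shared-memory index is derived from the thread's position inside the block rather than from its global position. Whether either applies to the team's kernel cannot be determined from this page.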
== Presentation ==