Changes

GPU610 Team Tsubame

180 bytes added, 13:41, 12 April 2017


== PHASE 3 ==

* '''Optimize attempt 1:'''

We tried to optimize the maze program by removing some of the if statements to reduce thread divergence. However, the attempt made the program slower than the phase 2 version and only slightly faster than the serial version.
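To illustrate what removing if statements can look like, here is a minimal sketch, not the team's actual kernel. It assumes a hypothetical cell encoding with values 0 to 3 (bit 0 = open to the east, bit 1 = open to the south) and writes two bytes per cell. The kernel names `k_branchy` and `k_branchless` and the host helper `run` are made up for this example, and CUDA error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical encoding (an assumption, not from the project):
// cell values are 0..3, bit 0 = open east, bit 1 = open south.

// Branching version: threads of one warp may follow different paths.
__global__ void k_branchy(const short* cells, unsigned char* out, int len) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= len) return;
    unsigned char east = 0, south = 0;
    if (cells[i] == 1)      { east = 255; }
    else if (cells[i] == 2) { south = 255; }
    else if (cells[i] == 3) { east = 255; south = 255; }
    out[2 * i]     = east;
    out[2 * i + 1] = south;
}

// Branch-free version: the same result computed with bit arithmetic.
__global__ void k_branchless(const short* cells, unsigned char* out, int len) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= len) return;  // bounds guard only
    int c = cells[i];
    out[2 * i]     = (unsigned char)((c & 1) * 255);
    out[2 * i + 1] = (unsigned char)(((c >> 1) & 1) * 255);
}

// Hypothetical host helper: runs one kernel and returns the output bytes.
std::vector<unsigned char> run(const std::vector<short>& cells, bool branchless) {
    int len = (int)cells.size();
    short* d_cells = nullptr;
    unsigned char* d_out = nullptr;
    cudaMalloc(&d_cells, len * sizeof(short));
    cudaMalloc(&d_out, 2 * len);
    cudaMemcpy(d_cells, cells.data(), len * sizeof(short), cudaMemcpyHostToDevice);
    int threads = 256, blocks = (len + threads - 1) / threads;
    if (branchless) k_branchless<<<blocks, threads>>>(d_cells, d_out, len);
    else            k_branchy<<<blocks, threads>>>(d_cells, d_out, len);
    std::vector<unsigned char> out(2 * len);
    cudaMemcpy(out.data(), d_out, 2 * len, cudaMemcpyDeviceToHost);
    cudaFree(d_cells);
    cudaFree(d_out);
    return out;
}
```

Both kernels produce identical output for cell values 0 to 3. Whether the branch-free form is actually faster depends on the rest of the kernel, so it should be timed rather than assumed.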

'''New Kernels'''

 // Initialize all pixels to black (hex 000)
 __global__ void k_drawWalls(png_byte* rows, const short* cells, const int width, const int height, const int len, const int size) {
     int i = blockIdx.x * blockDim.x + threadIdx.x;
 }
 // Set pixels to white according to the pattern the cell belongs to
 __global__ void k_drawPaths(png_byte* rows, const short* cells, const int width, const int height, const int len) {
     int i = blockIdx.x * blockDim.x + threadIdx.x;
 }

[[File:SPODiagram.PNG]]

'''Analysis:''' For this attempt, the kernel executes once for each cell instead of once for each byte (as in phase 2); it is no longer processing more than 1 pixel in each cell for every thread, which may be the cause of the longer processing time.
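The difference between the two thread mappings can be sketched as follows. This is an illustration under assumptions, not the project's code: it uses a hypothetical flat image with one byte per pixel, where each cell is drawn as a `size` x `size` square, and the names `k_perPixel`, `k_perCell` and `draw` are invented for the example. Error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical layout: width x height cells, each a size x size square of
// one-byte pixels in a flat row-major image (not the project's png rows).

// One thread per pixel: each thread performs a single store.
__global__ void k_perPixel(unsigned char* img, const short* cells,
                           int width, int height, int size) {
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    int imgW = width * size;
    if (p >= imgW * height * size) return;
    int cx = (p % imgW) / size;  // cell column of this pixel
    int cy = (p / imgW) / size;  // cell row of this pixel
    img[p] = cells[cy * width + cx] ? 255 : 0;
}

// One thread per cell: each thread loops over size * size pixels.
__global__ void k_perCell(unsigned char* img, const short* cells,
                          int width, int height, int size) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= width * height) return;
    int cx = i % width, cy = i / width;
    unsigned char v = cells[i] ? 255 : 0;
    for (int y = 0; y < size; ++y)
        for (int x = 0; x < size; ++x)
            img[(cy * size + y) * (width * size) + cx * size + x] = v;
}

// Hypothetical host helper: draws the image with either mapping.
std::vector<unsigned char> draw(const std::vector<short>& cells,
                                int width, int height, int size, bool perCell) {
    int nPix = width * size * height * size;
    short* d_cells = nullptr;
    unsigned char* d_img = nullptr;
    cudaMalloc(&d_cells, cells.size() * sizeof(short));
    cudaMalloc(&d_img, nPix);
    cudaMemcpy(d_cells, cells.data(), cells.size() * sizeof(short),
               cudaMemcpyHostToDevice);
    int threads = 256;
    if (perCell) {
        int n = width * height;
        k_perCell<<<(n + threads - 1) / threads, threads>>>(d_img, d_cells, width, height, size);
    } else {
        k_perPixel<<<(nPix + threads - 1) / threads, threads>>>(d_img, d_cells, width, height, size);
    }
    std::vector<unsigned char> img(nPix);
    cudaMemcpy(img.data(), d_img, nPix, cudaMemcpyDeviceToHost);
    cudaFree(d_cells);
    cudaFree(d_img);
    return img;
}
```

In the per-cell mapping, fewer threads each do more serial work, and neighbouring threads write to addresses `size` bytes apart, so the stores coalesce less well. That is one plausible contributor to a slowdown, though it would need profiling to confirm.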

* '''Optimize attempt 2:'''

We decided to use shared memory on the GPU to improve the speed of the program. However, the maze image is not displayed correctly.
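For reference, a generic sketch of staging cells in shared memory is given below. It is not the team's kernel and does not diagnose the incorrect image. It assumes a hypothetical 1D task in which each thread compares its cell with the left neighbour, and the names `TILE`, `k_sharedCells` and `markEdges` are invented here. Two details commonly affect correctness in this pattern: every thread in the block must reach `__syncthreads()`, and the halo element at the tile edge must be loaded explicitly.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define TILE 256  // threads per block; hypothetical choice, must match the launch

// Marks a cell with 255 when its value differs from its left neighbour.
__global__ void k_sharedCells(const short* cells, unsigned char* out, int len) {
    __shared__ short tile[TILE + 1];  // +1 slot for the left halo cell
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Out-of-range threads load a dummy value so that all threads
    // still reach the barrier below.
    tile[threadIdx.x + 1] = (i < len) ? cells[i] : 0;
    if (threadIdx.x == 0)
        tile[0] = (i > 0 && i <= len) ? cells[i - 1] : 0;
    __syncthreads();  // all loads complete before any thread reads the tile

    if (i < len)
        out[i] = (tile[threadIdx.x + 1] != tile[threadIdx.x]) ? 255 : 0;
}

// Hypothetical host helper: runs the kernel and returns the result.
std::vector<unsigned char> markEdges(const std::vector<short>& cells) {
    int len = (int)cells.size();
    short* d_cells = nullptr;
    unsigned char* d_out = nullptr;
    cudaMalloc(&d_cells, len * sizeof(short));
    cudaMalloc(&d_out, len);
    cudaMemcpy(d_cells, cells.data(), len * sizeof(short), cudaMemcpyHostToDevice);
    k_sharedCells<<<(len + TILE - 1) / TILE, TILE>>>(d_cells, d_out, len);
    std::vector<unsigned char> out(len);
    cudaMemcpy(out.data(), d_out, len, cudaMemcpyDeviceToHost);
    cudaFree(d_cells);
    cudaFree(d_out);
    return out;
}
```

Comparing the kernel output against a serial reference on a small input is a simple way to check that the shared-memory indexing is right before measuring speed.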

== Presentation ==

Yanhao Lei

