Changes

GPU610 Team Tsubame

1 byte removed, 13:46, 12 April 2017

=== Attempt 1: ===

We tried to optimize the maze program by removing some of the if statements to reduce thread divergence. However, the attempt made the program slower than the phase 2 version and only slightly faster than the serial version.
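As a rough illustration of what removing an if statement can look like, the sketch below writes the same byte values with and without a data-dependent branch. It is not the team's kernel: the names <code>k_fillBranching</code>, <code>k_fillBranchless</code>, <code>fillOnDevice</code> and the <code>flags</code> array are hypothetical and exist only for this example.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical branching kernel: threads in one warp may take different paths.
__global__ void k_fillBranching(unsigned char* out, const short* flags, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;              // bounds guard
    if (flags[i] != 0) {
        out[i] = 0xFF;               // white
    } else {
        out[i] = 0x00;               // black
    }
}

// Hypothetical branchless kernel: the comparison yields 0 or 1,
// which is scaled to 0x00 or 0xFF, so no data-dependent branch remains.
__global__ void k_fillBranchless(unsigned char* out, const short* flags, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;              // bounds guard
    out[i] = static_cast<unsigned char>((flags[i] != 0) * 0xFF);
}

// Host helper (assumes a non-empty input; error checking omitted for brevity).
std::vector<unsigned char> fillOnDevice(const std::vector<short>& flags, bool branchless) {
    const int n = static_cast<int>(flags.size());
    std::vector<unsigned char> out(n);
    short* dFlags = nullptr;
    unsigned char* dOut = nullptr;
    cudaMalloc(&dFlags, n * sizeof(short));
    cudaMalloc(&dOut, n);
    cudaMemcpy(dFlags, flags.data(), n * sizeof(short), cudaMemcpyHostToDevice);

    const int threads = 128;
    const int blocks = (n + threads - 1) / threads;
    if (branchless) {
        k_fillBranchless<<<blocks, threads>>>(dOut, dFlags, n);
    } else {
        k_fillBranching<<<blocks, threads>>>(dOut, dFlags, n);
    }

    cudaMemcpy(out.data(), dOut, n, cudaMemcpyDeviceToHost);
    cudaFree(dFlags);
    cudaFree(dOut);
    return out;
}
```

Both kernels produce the same output. For a branch this short the compiler may already emit predicated instructions instead of a real jump, so rewriting it by hand does not necessarily make the kernel faster.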

'''New Kernels:'''

// Initialize all pixels to black (hex 000)

__global__ void k_drawWalls(png_byte* rows, const short* cells, const int width, const int height, const int len, const int size) {

'''Analysis:'''

In this attempt, the kernel executes once per cell instead of once per byte as in phase 2. Each thread no longer processes more than one pixel in each cell, which may be the cause of the longer processing time.

=== Attempt 2: ===

We decided to use shared memory on the GPU to improve the speed of the program. However, the maze image did not display correctly: the paths appeared as randomly coloured pixels, because the threads were setting only part (1/3 or 2/3) of a pixel's values to hexadecimal F.
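One way to avoid partially written pixels is to make each thread responsible for every channel of one pixel while the data is staged in shared memory. The sketch below is a minimal example under assumptions that are not taken from the project: three one-byte channels per pixel, a flat <code>isPath</code> array with one entry per pixel, and the hypothetical names <code>k_colourPixels</code> and <code>colourPixels</code>.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int kChannels = 3;    // assumed: RGB, one byte per channel
constexpr int kThreads  = 128;  // threads (and pixels) per block

// Hypothetical kernel: isPath[p] != 0 marks pixel p as part of a path (white).
// Must be launched with blockDim.x == kThreads, since the tile is sized for it.
__global__ void k_colourPixels(unsigned char* image, const unsigned char* isPath, int numPixels) {
    __shared__ unsigned char tile[kThreads * kChannels];

    const int pixel = blockIdx.x * blockDim.x + threadIdx.x;

    // Each thread fills all three channels of its own pixel in the tile,
    // so no pixel in shared memory is ever left partially set.
    const unsigned char value = (pixel < numPixels && isPath[pixel]) ? 0xFF : 0x00;
    for (int c = 0; c < kChannels; ++c) {
        tile[threadIdx.x * kChannels + c] = value;
    }

    __syncthreads();  // every thread in the block reaches this barrier

    // The block copies the finished tile to global memory;
    // consecutive threads write consecutive bytes.
    const int blockBase  = blockIdx.x * blockDim.x * kChannels;
    const int totalBytes = numPixels * kChannels;
    for (int b = threadIdx.x; b < blockDim.x * kChannels; b += blockDim.x) {
        const int g = blockBase + b;
        if (g < totalBytes) {
            image[g] = tile[b];
        }
    }
}

// Host helper (assumes a non-empty input; error checking omitted for brevity).
std::vector<unsigned char> colourPixels(const std::vector<unsigned char>& isPath) {
    const int n = static_cast<int>(isPath.size());
    std::vector<unsigned char> image(static_cast<size_t>(n) * kChannels);
    unsigned char* dImage = nullptr;
    unsigned char* dPath = nullptr;
    cudaMalloc(&dImage, image.size());
    cudaMalloc(&dPath, n);
    cudaMemcpy(dPath, isPath.data(), n, cudaMemcpyHostToDevice);

    const int blocks = (n + kThreads - 1) / kThreads;
    k_colourPixels<<<blocks, kThreads>>>(dImage, dPath, n);

    cudaMemcpy(image.data(), dImage, image.size(), cudaMemcpyDeviceToHost);
    cudaFree(dImage);
    cudaFree(dPath);
    return image;
}
```

Because the barrier separates filling the tile from copying it out, every byte written to global memory comes from a pixel whose three channels were all set by the same thread.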

Yanhao Lei
