Changes

UnknownX

72 bytes added, 05:50, 13 April 2017
== Assignment 2 - V1 Parallelization ==
CPU code:
 
This is the most expensive part of the program.
Main code in the .cu file:
 
1. Allocate memory on the device.
 
2. Run the kernel with ntpb = 1024 (number of threads per block).
 
3. Copy the key data back out to the host.
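The three steps above can be sketched roughly as follows. This is only an illustration, not the code from the assignment: the kernel name `fill_column`, the helpers `blocks_for` and `run_sketch`, and the value written to each pixel are assumptions. Only `pixs_z`, the `y * N + x` indexing and `ntpb = 1024` come from this page.

```cuda
#include <cassert>
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: one thread per column x, looping over the rows y.
// The stored value (x + y) is a placeholder for the real pixel computation.
__global__ void fill_column(int* pixs_z, int n) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    if (x >= n) return;                 // guard threads past the last column
    for (int y = 0; y < n; ++y)
        pixs_z[y * n + x] = x + y;
}

// Number of blocks needed to cover n columns with ntpb threads per block.
int blocks_for(int n, int ntpb) {
    assert(n > 0 && ntpb > 0);
    return (n + ntpb - 1) / ntpb;       // round up
}

// Hypothetical host routine; returns an empty vector on any CUDA error.
std::vector<int> run_sketch(int n) {
    const int ntpb = 1024;              // threads per block, as on this page
    std::vector<int> host(static_cast<std::size_t>(n) * n);
    const std::size_t bytes = host.size() * sizeof(int);

    // 1. Allocate memory on the device.
    int* dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return {};

    // 2. Run the kernel with ntpb threads per block.
    fill_column<<<blocks_for(n, ntpb), ntpb>>>(dev, n);
    cudaError_t err = cudaGetLastError();   // catches launch failures

    // 3. Copy the data out (cudaMemcpy waits for the kernel to finish).
    if (err == cudaSuccess)
        err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dev);
    if (err != cudaSuccess) return {};
    return host;
}
```

Calling `run_sketch(N)` from `main` would return the N x N buffer in row-major order. With ntpb = 1024, any N up to 1024 is covered by a single block, so the rounding in `blocks_for` only matters for larger images.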
Kernel:
 
before:
for (int y = 0; y < N; ++y)
pixs_z[y * N + x] = (int)pix_col.z;
}
 
Profile on nvvp:
[[File:matrix.senecac.on.ca/~zzha1/Capture.PNG]]
== Assignment 3 - Optimization ==