→Kernel Attempts
-I am currently trying to get the above kernels working before handing in the assignment, as the initialization kernel alone would not be sufficient for the purpose of this assignment.
-At the moment I am working on simplified versions of a prefix sum algorithm that I hope will put me on the right path to completing the assignment. These algorithms were gathered from sources such as MIT, NVIDIA, and the CUDA documentation.
-Below is a sequential implementation of a prescan (exclusive prefix sum) on an array.
<source lang="cpp">
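// Sequential prescan (exclusive prefix sum):
// arr1[i] = input[0] + ... + input[i-1], with arr1[0] = 0.
void scan(float* arr1, const float* input, int n) {
    if (n <= 0) return; // nothing to do for an empty array
    arr1[0] = 0; // since this is a prescan, not a scan
    for (int i = 1; i < n; ++i) {
        arr1[i] = input[i-1] + arr1[i-1];
    }
}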
</source>
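To make the difference between a prescan and an ordinary (inclusive) scan concrete, here is a small sketch that could be used to check results by hand. The names `prescan` and `inclusive_scan_ref` are made up for this example and are not part of the assignment code, and it assumes `std::vector` in place of the raw pointers used above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: prescan (exclusive prefix sum).
// out[i] = in[0] + ... + in[i-1], so out[0] is always 0.
std::vector<float> prescan(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    float running = 0.0f;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;   // sum of everything strictly before index i
        running += in[i];
    }
    return out;
}

// Hypothetical helper: inclusive scan, for comparison.
// out[i] = in[0] + ... + in[i].
std::vector<float> inclusive_scan_ref(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    float running = 0.0f;
    for (std::size_t i = 0; i < in.size(); ++i) {
        running += in[i];
        out[i] = running;   // sum up to and including index i
    }
    return out;
}
```

For the input `{3, 1, 7, 0, 4, 1, 6, 3}`, the prescan gives `{0, 3, 4, 11, 11, 15, 16, 22}` while the inclusive scan gives `{3, 4, 11, 11, 15, 16, 22, 25}`. The prescan is the inclusive result shifted right by one position with a leading zero.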
=== Assignment 3 ===