CDOT Wiki β

Sirius

700 bytes removed, 09:41, 9 April 2018
Assignment 3
Parallel programming could give the application a substantial performance boost. Most of the computation time is spent calculating the neighbourhood average for every pixel, and each pixel's average is independent of the others, so they can all be calculated concurrently. Only a single synchronization is required at the end, before the image is displayed.
 
=== Source Code for Box Blur ===
<syntaxhighlight lang="cpp">
int findingNeighbors(Mat img, int i, int j, int neighbour, float * b, float * g, float * r) {
    int row_limit = img.rows;
    int column_limit = img.cols;
    int half = neighbour / 2; // integer division already rounds down, so floor() is not needed
    Scalar temp;
    double blue = 0, red = 0, green = 0;

    // Sum the B, G and R channels over the window centred on (i, j),
    // skipping any neighbours that fall outside the image.
    for (int x = i - half; x <= i + half; x++) {
        for (int y = j - half; y <= j + half; y++) {
            if (x >= 0 && y >= 0 && x < row_limit && y < column_limit) {
                temp = img.at<Vec3b>(x, y);
                blue += temp.val[0];
                green += temp.val[1];
                red += temp.val[2];
            }
        }
    }
    // Divide by the full window area. Neighbours skipped at the image border
    // therefore count as zero, which darkens the edges slightly.
    *b = blue / pow(neighbour, 2);
    *g = green / pow(neighbour, 2);
    *r = red / pow(neighbour, 2);
    return 1;
}
</syntaxhighlight>
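As a rough illustration of why this work parallelises well, the sketch below applies a box blur to a single-channel image held in a flat <code>std::vector</code> instead of an OpenCV <code>Mat</code>. The names <code>blurPixel</code> and <code>boxBlur</code> are hypothetical and not part of the assignment code. Unlike <code>findingNeighbors</code>, this sketch divides by the number of samples actually read, which is an assumption made here to keep border pixels from darkening.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: average of the size x size window centred on
// (row, col) of a row-major, single-channel image.
// Assumption of this sketch: out-of-bounds samples are skipped and the
// divisor is the number of samples actually read.
float blurPixel(const std::vector<float>& src, int rows, int cols,
                int row, int col, int size) {
    assert(size > 0 && size % 2 == 1); // odd window so it has a centre pixel
    const int half = size / 2;
    float sum = 0.0f;
    int count = 0;
    for (int x = row - half; x <= row + half; ++x) {
        for (int y = col - half; y <= col + half; ++y) {
            if (x >= 0 && y >= 0 && x < rows && y < cols) {
                sum += src[static_cast<std::size_t>(x) * cols + y];
                ++count;
            }
        }
    }
    return sum / count; // count >= 1 because the centre is always in bounds
}

// Hypothetical driver: blurs every pixel into a separate output buffer.
std::vector<float> boxBlur(const std::vector<float>& src, int rows, int cols,
                           int size) {
    assert(src.size() == static_cast<std::size_t>(rows) * cols);
    std::vector<float> dst(src.size());
    // Each iteration only reads src and writes one distinct element of dst,
    // so the iterations do not depend on each other and could run in any
    // order, for example one pixel per GPU thread.
    for (int idx = 0; idx < rows * cols; ++idx) {
        dst[idx] = blurPixel(src, rows, cols, idx / cols, idx % cols, size);
    }
    return dst;
}
```

Because the input is never modified and every output element is written exactly once, the loop index <code>idx</code> maps naturally onto a global thread index, and the only synchronization needed is waiting for all pixels to finish before the result is displayed.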
=== Algorithms (Joseph Pildush) ===
<syntaxhighlight lang="cpp">
int iDevice;
cudaDeviceProp prop;
cudaGetDevice(&iDevice);
cudaGetDeviceProperties(&prop, iDevice);
int resident_threads = prop.maxThreadsPerMultiProcessor;
int resident_blocks = 8;
if (prop.major >= 3 && prop.major < 5) {
    resident_blocks = 16;
}
else if (prop.major >= 5 && prop.major <= 6) {
    resident_blocks = 32;
}
//determine threads/block
dim3 blockDims(resident_threads / resident_blocks, 1, 1);
//Calculate grid size to cover the whole image, rounding up so that
//a partial final block still covers the trailing pixels
dim3 gridDims((pixels + blockDims.x - 1) / blockDims.x);
</syntaxhighlight>
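The launch-configuration arithmetic can be checked on the host without a GPU. The sketch below uses two hypothetical helpers, <code>threadsPerBlockFor</code> and <code>blocksToCover</code>, that are not part of the assignment code. The figures in the usage note (2048 resident threads, 32 resident blocks, 1001 pixels) are illustrative assumptions, not values measured on any particular device.

```cpp
#include <cassert>

// Hypothetical helper: threads per block chosen so that the maximum number
// of resident blocks exactly fills a multiprocessor's resident threads.
unsigned int threadsPerBlockFor(unsigned int residentThreads,
                                unsigned int residentBlocks) {
    assert(residentBlocks > 0);
    return residentThreads / residentBlocks;
}

// Hypothetical helper: smallest number of blocks whose threads cover every
// pixel. Plain integer division rounds down and would leave the last
// partial block's pixels unprocessed, so this rounds up instead.
unsigned int blocksToCover(unsigned int pixels, unsigned int threadsPerBlock) {
    assert(threadsPerBlock > 0);
    return (pixels + threadsPerBlock - 1) / threadsPerBlock;
}
```

With the assumed figures, <code>threadsPerBlockFor(2048, 32)</code> gives 64 threads per block, and <code>blocksToCover(1001, 64)</code> gives 16 blocks (1024 threads), whereas <code>1001 / 64</code> would give only 15 blocks (960 threads). Since rounding up launches more threads than there are pixels, a kernel using such a grid would need to check its global index against the pixel count before reading or writing.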