GroupNumberUndefined

845 bytes added, 19:31, 11 April 2017
Assignment 3
Upon further inspection of the function and kernel, I realized that the array of pixels taken from oldImage was never used inside the kernel, so I removed it entirely. This included removing its memory allocation and the host-to-device copy of the array, which further reduces the run time of the function.
 
[[File:hArrayRemove.jpg]]
Furthermore, I had previously put the "check for bounds" calculation and the "fill in empty pixels" calculation inside two separate nested for-loops. I have combined them into a single nested loop, which removes one full pass over the image and should improve performance.
 
[[File:NestedCombined.jpg]]
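
The idea of merging two nested loops into one can be sketched in plain C++. This is only an illustration, not the code from the screenshot: the function names, the constants, and the per-pixel rules (treat out-of-range values as empty, then replace empty pixels with a fill value) are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constants, chosen only for this illustration.
constexpr int kEmpty = 0;    // marks a pixel with no valid value
constexpr int kFill = 128;   // value written into empty pixels
constexpr int kMax = 255;    // largest valid pixel value

// Before: two separate nested loops, so the image is traversed twice.
std::vector<int> processTwoPasses(std::vector<int> img, std::size_t rows, std::size_t cols) {
    assert(img.size() == rows * cols);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c) {
            int& p = img[r * cols + c];
            if (p < 0 || p > kMax) p = kEmpty;   // bounds check
        }
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c) {
            int& p = img[r * cols + c];
            if (p == kEmpty) p = kFill;          // fill empty pixels
        }
    return img;
}

// After: both steps run in one nested loop, so the image is traversed once.
std::vector<int> processFused(std::vector<int> img, std::size_t rows, std::size_t cols) {
    assert(img.size() == rows * cols);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c) {
            int& p = img[r * cols + c];
            if (p < 0 || p > kMax) p = kEmpty;   // bounds check
            if (p == kEmpty) p = kFill;          // fill empty pixels
        }
    return img;
}
```

Both versions return the same image here because each step only reads and writes the current pixel. If the fill step read neighbouring pixels, merging the loops could change the result, so that is worth checking before combining them.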
 
Overall, this is what the optimized rotateImage() function and the rotate() kernel look like:
 
[[File:OptimizedFunction.jpg]]
 
Some calculations previously done inside the kernel (finding the center of the image and converting the angle to radians) were moved outside the kernel, and their values are now passed in as arguments. The kernel:
 
[[File:OptimizedKernel.jpg]]
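
As a rough sketch of hoisting these values out of the kernel, the plain C++ below computes the center and the angle in radians once and passes them to a per-pixel function that stands in for the kernel body. The struct RotateParams, the helper makeRotateParams(), the function rotatePixel(), and the use of a floating-point center are all hypothetical and are not taken from the screenshots.

```cpp
#include <cassert>
#include <cmath>

// Values computed once on the host (hypothetical struct for this sketch).
struct RotateParams {
    float r0;    // row of the image center
    float c0;    // column of the image center
    float rads;  // rotation angle in radians
};

// Runs once per image, before any per-pixel work is launched.
RotateParams makeRotateParams(int rows, int cols, float degrees) {
    assert(rows > 0 && cols > 0);
    const float kPi = 3.14159265358979f;
    return RotateParams{rows / 2.0f, cols / 2.0f, degrees * kPi / 180.0f};
}

// Stand-in for the per-thread kernel body: it only consumes the
// precomputed values instead of recomputing them for every pixel.
void rotatePixel(int r, int c, const RotateParams& p, int& r1, int& c1) {
    const float dr = r - p.r0;
    const float dc = c - p.c0;
    r1 = static_cast<int>(p.r0 + dr * std::cos(p.rads) - dc * std::sin(p.rads));
    c1 = static_cast<int>(p.c0 + dr * std::sin(p.rads) + dc * std::cos(p.rads));
}
```

In a CUDA version the three values would be passed as kernel arguments, so each thread reads them rather than repeating the same arithmetic.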
 
Profiling with the same images gives the following result.
 
[[File:OptimizedChart.jpg]]
 
The enlarge function offers few opportunities for optimization. The only change I made was to store some of the calculations in registers; the image below shows the final version of the enlarge function. The performance improvement was not significant enough to document.
 
[[File:OptimzedFunc.PNG]]
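
To illustrate what keeping a calculation in a register can look like, here is a minimal C++ sketch of a nearest-neighbour enlarge that stores the row offsets in local variables instead of recomputing them for every pixel. The function signature, the integer scale factor, and the nearest-neighbour rule are assumptions for this example and may differ from the function in the screenshot.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical enlarge: scales a row-major image by an integer factor.
std::vector<int> enlarge(const std::vector<int>& src, int rows, int cols, int factor) {
    assert(rows > 0 && cols > 0 && factor > 0);
    assert(src.size() == static_cast<std::size_t>(rows) * cols);
    const int outRows = rows * factor;
    const int outCols = cols * factor;
    std::vector<int> dst(static_cast<std::size_t>(outRows) * outCols);
    for (int r = 0; r < outRows; ++r) {
        // Computed once per output row and kept in locals, which a
        // compiler will typically hold in registers.
        const int srcRowStart = (r / factor) * cols;
        const int dstRowStart = r * outCols;
        for (int c = 0; c < outCols; ++c) {
            dst[dstRowStart + c] = src[srcRowStart + c / factor];
        }
    }
    return dst;
}
```

For example, enlarging the 2x2 image {1, 2, 3, 4} by a factor of 2 yields a 4x4 image in which each source pixel becomes a 2x2 block.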