Changes

BETTERRED

2,585 bytes added, 02:47, 26 February 2017
Hotspot
significantly. The code is straightforward, and parallelization should be easy to implement.
=== Hotspot ===
{| class="wikitable mw-collapsible mw-collapsed"
! Culprit - BlurImage
|-
|<syntaxhighlight lang="cpp">
void BlurImage(const SImageData& srcImage, SImageData &destImage, float xblursigma, float yblursigma, unsigned int xblursize, unsigned int yblursize)
{
    // allocate space for copying the image for destImage and tmpImage
    destImage.m_width = srcImage.m_width;
    destImage.m_height = srcImage.m_height;
    destImage.m_pitch = srcImage.m_pitch;
    destImage.m_pixels.resize(destImage.m_height * destImage.m_pitch);

    SImageData tmpImage;
    tmpImage.m_width = srcImage.m_width;
    tmpImage.m_height = srcImage.m_height;
    tmpImage.m_pitch = srcImage.m_pitch;
    tmpImage.m_pixels.resize(tmpImage.m_height * tmpImage.m_pitch);

    // horizontal blur from srcImage into tmpImage
    {
        auto row = GaussianKernelIntegrals(xblursigma, xblursize);

        int startOffset = -1 * int(row.size() / 2);

        for (int y = 0; y < tmpImage.m_height; ++y)
        {
            for (int x = 0; x < tmpImage.m_width; ++x)
            {
                std::array<float, 3> blurredPixel = { { 0.0f, 0.0f, 0.0f } };
                for (unsigned int i = 0; i < row.size(); ++i)
                {
                    const uint8_t *pixel = GetPixelOrBlack(srcImage, x + startOffset + i, y);
                    blurredPixel[0] += float(pixel[0]) * row[i];
                    blurredPixel[1] += float(pixel[1]) * row[i];
                    blurredPixel[2] += float(pixel[2]) * row[i];
                }

                uint8_t *destPixel = &tmpImage.m_pixels[y * tmpImage.m_pitch + x * 3];

                destPixel[0] = uint8_t(blurredPixel[0]);
                destPixel[1] = uint8_t(blurredPixel[1]);
                destPixel[2] = uint8_t(blurredPixel[2]);
            }
        }
    }

    // vertical blur from tmpImage into destImage
    {
        auto row = GaussianKernelIntegrals(yblursigma, yblursize);

        int startOffset = -1 * int(row.size() / 2);

        for (int y = 0; y < destImage.m_height; ++y)
        {
            for (int x = 0; x < destImage.m_width; ++x)
            {
                std::array<float, 3> blurredPixel = { { 0.0f, 0.0f, 0.0f } };
                for (unsigned int i = 0; i < row.size(); ++i)
                {
                    const uint8_t *pixel = GetPixelOrBlack(tmpImage, x, y + startOffset + i);
                    blurredPixel[0] += float(pixel[0]) * row[i];
                    blurredPixel[1] += float(pixel[1]) * row[i];
                    blurredPixel[2] += float(pixel[2]) * row[i];
                }

                uint8_t *destPixel = &destImage.m_pixels[y * destImage.m_pitch + x * 3];

                destPixel[0] = uint8_t(blurredPixel[0]);
                destPixel[1] = uint8_t(blurredPixel[1]);
                destPixel[2] = uint8_t(blurredPixel[2]);
            }
        }
    }
}
</syntaxhighlight>
|}
According to the Flat profile, 61.38% of the time is spent in the BlurImage function. This function contains two triply nested for-loops (rows × columns × kernel taps), which gives a run time T(n) of O(n<sup>3</sup>) when all three dimensions grow with n.<br/>
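As a rough cost model (a sketch, not a measurement): the symbols below are assumptions introduced here, with <math>W</math> and <math>H</math> the image width and height in pixels and <math>K_x</math>, <math>K_y</math> the number of taps in the horizontal and vertical kernels. Each tap costs one multiply-add per colour channel, so the two passes together perform approximately

<math>T(W, H, K_x, K_y) \approx 3\,W\,H\,(K_x + K_y)</math>

multiply-adds. For example, a 1920 × 1080 image blurred with two 31-tap kernels would need about <math>3 \cdot 1920 \cdot 1080 \cdot 62 \approx 3.9 \times 10^{8}</math> multiply-adds.<br/>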
Referring to the Call graph, we can see further evidence that this application spends nearly all of its execution time in the BlurImage function. Therefore, this function is the prime candidate<br/>
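To illustrate why the loops above lend themselves to parallelization, here is a minimal sketch of the horizontal pass restricted to a range of rows. It is not the project's implementation: <code>Image</code> and <code>BlurRowsHorizontal</code> are hypothetical names, the image is assumed to be tightly packed 8-bit RGB (no row padding), and the OpenMP pragma is only one possible way to distribute the rows. Without OpenMP enabled, the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for SImageData: tightly packed 8-bit RGB, no row padding.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<uint8_t> pixels;  // width * height * 3 bytes
};

// Horizontal blur of rows [yBegin, yEnd) from src into dst.
// Taps that fall outside the image contribute black, mirroring GetPixelOrBlack.
void BlurRowsHorizontal(const Image& src, Image& dst,
                        const std::vector<float>& kernel,
                        int yBegin, int yEnd)
{
    assert(dst.pixels.size() == src.pixels.size());
    const int taps = int(kernel.size());
    const int startOffset = -(taps / 2);

    // Every output pixel reads only from src and writes only its own bytes
    // in dst, so the iterations over y do not depend on one another.
    #pragma omp parallel for
    for (int y = yBegin; y < yEnd; ++y) {
        for (int x = 0; x < src.width; ++x) {
            float sum[3] = { 0.0f, 0.0f, 0.0f };
            for (int i = 0; i < taps; ++i) {
                const int sx = x + startOffset + i;
                if (sx < 0 || sx >= src.width)
                    continue;  // out of bounds: treated as a black pixel
                const uint8_t* p = &src.pixels[(y * src.width + sx) * 3];
                sum[0] += float(p[0]) * kernel[i];
                sum[1] += float(p[1]) * kernel[i];
                sum[2] += float(p[2]) * kernel[i];
            }
            uint8_t* d = &dst.pixels[(y * src.width + x) * 3];
            d[0] = uint8_t(sum[0]);
            d[1] = uint8_t(sum[1]);
            d[2] = uint8_t(sum[2]);
        }
    }
}
```

Calling the function once for rows 0 to <code>height</code>, or several times for disjoint row ranges, should produce the same output. That independence is what a thread-per-row-range or GPU thread-per-pixel version would rely on. The vertical pass can be treated the same way once the horizontal pass has finished.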