Optimize C++ bitmap processing algorithm


I have written the following algorithm (for Android/NDK) to apply levels to a bitmap. The problem is that it is really slow: on a fast device such as the SGSIII it can take up to 4 seconds for an 8MP image, and on devices with ARMv6 it takes ages (over 10 seconds). Is there any way to optimize it?

    #include <math.h>

    void applylevels(unsigned int *rgb, const unsigned int width, const unsigned int height, const float exposure, const float brightness, const float contrast, const float saturation)
    {
        float r, g, b;

        unsigned int pixelindex = 0;

        float exposurefactor   = powf(2.0f, exposure);
        float brightnessfactor = brightness / 10.0f;
        float contrastfactor   = contrast > 0.0f ? contrast : 0.0f;

        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                const int pixelvalue = rgb[pixelindex];

                // unpack the three color components and scale them to 0..1
                r = ((pixelvalue & 0xff0000) >> 16) / 255.0f;
                g = ((pixelvalue & 0xff00) >> 8) / 255.0f;
                b = (pixelvalue & 0xff) / 255.0f;

                // clamp values
                r = r > 1.0f ? 1.0f : r < 0.0f ? 0.0f : r;
                g = g > 1.0f ? 1.0f : g < 0.0f ? 0.0f : g;
                b = b > 1.0f ? 1.0f : b < 0.0f ? 0.0f : b;

                // exposure
                r *= exposurefactor;
                g *= exposurefactor;
                b *= exposurefactor;

                // contrast
                r = (((r - 0.5f) * contrastfactor) + 0.5f);
                g = (((g - 0.5f) * contrastfactor) + 0.5f);
                b = (((b - 0.5f) * contrastfactor) + 0.5f);

                // saturation
                float gray = (r * 0.3f) + (g * 0.59f) + (b * 0.11f);
                r = gray * (1.0f - saturation) + r * saturation;
                g = gray * (1.0f - saturation) + g * saturation;
                b = gray * (1.0f - saturation) + b * saturation;

                // brightness
                r += brightnessfactor;
                g += brightnessfactor;
                b += brightnessfactor;

                // clamp values
                r = r > 1.0f ? 1.0f : r < 0.0f ? 0.0f : r;
                g = g > 1.0f ? 1.0f : g < 0.0f ? 0.0f : g;
                b = b > 1.0f ? 1.0f : b < 0.0f ? 0.0f : b;

                // store new pixel value (the top byte is not written back)
                r *= 255.0f;
                g *= 255.0f;
                b *= 255.0f;

                rgb[pixelindex] = ((int)r << 16) | ((int)g << 8) | (int)b;

                pixelindex++;
            }
        }
    }

Most of the computations can trivially be tabled, i.e. precomputed once into lookup tables indexed by the 8-bit component value. The whole processing can then become

    for (int i = 0; i < n; i++) {
        int px = buffer[i];
        int r = tab1[(px >> 16) & 255];
        int g = tab1[(px >> 8) & 255];
        int b = tab1[px & 255];
        // kr, kg, kb: gray weights in 16.16 fixed point (implied by the >> 16)
        int gray = (kr*r + kg*g + kb*b) >> 16;
        int grayval = tsat1[gray];
        r = brtab[tsat2[r] + grayval];
        g = brtab[tsat2[g] + grayval];
        b = brtab[tsat2[b] + grayval];
        buffer[i] = (r << 16) | (g << 8) | b;
    }

where

  • tab1 is a table of 256 bytes tabling the result of exposure and contrast processing
  • tsat1 and tsat2 are 256-byte tables for saturation processing
  • brtab is a 512-byte table for brightness processing
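The answer does not show how the tables are filled, so here is one possible sketch. The names `LevelTables`, `buildLevelTables`, `applyLevelsTabled` and `clampToByte` are hypothetical, and the sketch assumes `0 <= saturation <= 1` so that every table entry fits in a byte. It rounds to the nearest integer and clamps after the contrast step, so its output can differ slightly from the floating-point version in the question. It also keeps the top byte of each pixel, which the loop above does not.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical container for the tables named in the answer.
struct LevelTables {
    std::uint8_t tab1[256];   // exposure + contrast, clamped to 0..255
    std::uint8_t tsat1[256];  // gray * (1 - saturation)
    std::uint8_t tsat2[256];  // component * saturation
    std::uint8_t brtab[512];  // brightness offset + final clamp
};

// Round to nearest and clamp to 0..255 (an assumption: the question truncates).
static std::uint8_t clampToByte(float v) {
    return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, v + 0.5f)));
}

// Assumes 0 <= saturation <= 1; larger values would need signed tables.
LevelTables buildLevelTables(float exposure, float brightness,
                             float contrast, float saturation) {
    LevelTables t;
    const float exposureFactor = std::pow(2.0f, exposure);
    const float contrastFactor = contrast > 0.0f ? contrast : 0.0f;
    const float brightnessOffset = brightness / 10.0f * 255.0f;

    for (int i = 0; i < 256; i++) {
        float v = i / 255.0f * exposureFactor;
        v = (v - 0.5f) * contrastFactor + 0.5f;
        t.tab1[i]  = clampToByte(v * 255.0f);
        t.tsat1[i] = clampToByte(i * (1.0f - saturation));
        t.tsat2[i] = clampToByte(i * saturation);
    }
    for (int i = 0; i < 512; i++) {
        t.brtab[i] = clampToByte(i + brightnessOffset);
    }
    return t;
}

void applyLevelsTabled(std::uint32_t *buffer, std::size_t n, const LevelTables &t) {
    // Gray weights 0.30, 0.59, 0.11 in 16.16 fixed point; they sum to 65536.
    const int kr = 19661, kg = 38666, kb = 7209;

    for (std::size_t i = 0; i < n; i++) {
        const std::uint32_t px = buffer[i];
        int r = t.tab1[(px >> 16) & 255];
        int g = t.tab1[(px >> 8) & 255];
        int b = t.tab1[px & 255];
        const int gray = (kr * r + kg * g + kb * b) >> 16;
        const int grayval = t.tsat1[gray];
        r = t.brtab[t.tsat2[r] + grayval];
        g = t.brtab[t.tsat2[g] + grayval];
        b = t.brtab[t.tsat2[b] + grayval];
        // Keep the top byte unchanged (a choice made in this sketch).
        buffer[i] = (px & 0xff000000u) | (static_cast<std::uint32_t>(r) << 16)
                  | (static_cast<std::uint32_t>(g) << 8) | static_cast<std::uint32_t>(b);
    }
}
```

The tables are rebuilt only when a parameter changes, so floating point is touched for a few hundred entries instead of once per component of every pixel.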

Note that without saturation processing you need just one lookup per component in a single 256-byte table.
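As a rough illustration of that simpler case, the sketch below folds exposure, contrast, brightness and the final clamp into one table. The function names `buildSingleTable` and `applySingleTable` are hypothetical. "No saturation processing" is taken here to mean `saturation == 1.0f` in the question's code, where the gray term drops out. Keeping the top byte of each pixel is again a choice of this sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Precompute the whole per-component pipeline for every possible input byte.
void buildSingleTable(std::uint8_t lut[256], float exposure,
                      float brightness, float contrast) {
    const float exposureFactor   = std::pow(2.0f, exposure);
    const float brightnessFactor = brightness / 10.0f;
    const float contrastFactor   = contrast > 0.0f ? contrast : 0.0f;

    for (int i = 0; i < 256; i++) {
        float v = i / 255.0f;
        v *= exposureFactor;                              // exposure
        v = (v - 0.5f) * contrastFactor + 0.5f;           // contrast
        v += brightnessFactor;                            // brightness
        v = v > 1.0f ? 1.0f : v < 0.0f ? 0.0f : v;        // clamp
        lut[i] = static_cast<std::uint8_t>(v * 255.0f);   // truncates like the (int) cast
    }
}

// One table lookup per component; no floating point in the pixel loop.
void applySingleTable(std::uint32_t *buffer, std::size_t n,
                      const std::uint8_t lut[256]) {
    for (std::size_t i = 0; i < n; i++) {
        const std::uint32_t px = buffer[i];
        const std::uint32_t r = lut[(px >> 16) & 255];
        const std::uint32_t g = lut[(px >> 8) & 255];
        const std::uint32_t b = lut[px & 255];
        buffer[i] = (px & 0xff000000u) | (r << 16) | (g << 8) | b;
    }
}
```

Because each table entry is computed with the same float steps as the original loop, this variant should give the same color values as the question's function whenever saturation is 1.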

A huge part of the speed problem may be that you are using floating-point computations on hardware that has no dedicated unit for them. A software implementation of floating point is really slow.
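If a step has to stay in the per-pixel loop, it can usually be rewritten with integers only. The sketch below is one way to do the contrast step in 8.8 fixed point; `contrastFixed` and `contrastQ8` are hypothetical names, and pivoting around 128 instead of 127.5 is an approximation of the `0.5f` used in the question.

```cpp
#include <cstdint>

// Contrast around mid-gray using integer math only.
// contrastQ8 is the contrast factor times 256, e.g. 384 for a factor of 1.5.
inline std::uint8_t contrastFixed(std::uint8_t v, int contrastQ8) {
    // Division instead of >> keeps the result well defined for negative values.
    const int out = ((static_cast<int>(v) - 128) * contrastQ8) / 256 + 128;
    return static_cast<std::uint8_t>(out < 0 ? 0 : out > 255 ? 255 : out);
}
```

The factor is converted once outside the loop, for example `int contrastQ8 = (int)(contrast * 256.0f);`, so only that single conversion uses floating point.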

