C / C++ / MFC

 
General: Re: RichEdit Controls / Embedded Objects
Steve Thresher, 30-Sep-09 14:54
General: Re: RichEdit Controls / Embedded Objects
Richard MacCutchan, 30-Sep-09 14:59
General: Re: RichEdit Controls / Embedded Objects
Steve Thresher, 30-Sep-09 15:05
General: Re: RichEdit Controls / Embedded Objects but seriously
Richard MacCutchan, 1-Oct-09 0:26
General: Re: RichEdit Controls / Embedded Objects but seriously
Steve Thresher, 1-Oct-09 1:03
General: Re: RichEdit Controls / Embedded Objects but seriously
Gary R. Wheeler, 1-Oct-09 2:19
General: Re: RichEdit Controls / Embedded Objects
Dan Neely, 1-Oct-09 3:25
Question: float vs double - tricky performance issue [modified]
Chris Losinger, 30-Sep-09 11:50
(this is a long one, i apologize)

i'm working on an image filter. the basic operation is:

1. run across a row of pixels (one BYTE per component); perform a calculation and store the result in an array of floats. the output array is the same width as a pixel row.

2. run across the same row, but in the opposite direction; perform a similar (but not identical) calculation. store the results in a different array of floats.

3. sum each pair from the two float arrays and store the result in a temporary float image of the same dimensions as the input image, but rotate the output 90 degrees. in other words: output rows as columns in the temp image.

4. using the same basic method as 1 and 2, process the floating point temp image. sum the results and write them to an output 8-bit image.

so, again: there are two calculation loops and a summation loop per row. after all input rows are done, the process is repeated using the temp as input.

Loop over 8-bit Rows
   Loop over Column[y] left to right => tempRow1
   Loop over Column[y] right to left => tempRow2
   tempRow1 + tempRow2 => temp f.p. image row[y] // 90 deg rotation

Loop over temp f.p. Rows
   Loop over Column[y] left to right => tempRow1
   Loop over Column[y] right to left => tempRow2
   tempRow1 + tempRow2 => output 8-bit image row[y] // -90 deg rotation
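
The two-pass structure above can be sketched in C++ roughly as follows. This is only an illustration under assumptions: `processRowsTransposed` and the `rowOp` callback are hypothetical names, the "rotation" is implemented as a plain transpose (rows written as columns), and the conversion to the output type does no rounding or clamping.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post).
// Runs rowOp over every row of src (width x height). rowOp fills a
// left-to-right buffer and a right-to-left buffer; their sum is stored
// so that row y of the input becomes column y of the result.
template <typename Real, typename Out, typename In, typename RowOp>
std::vector<Out> processRowsTransposed(const std::vector<In>& src,
                                       std::size_t width, std::size_t height,
                                       RowOp rowOp)
{
    std::vector<Out> dst(width * height);
    std::vector<Real> forward(width), backward(width);

    for (std::size_t y = 0; y < height; ++y) {
        const In* row = src.data() + y * width;
        rowOp(row, width, forward.data(), backward.data());

        for (std::size_t x = 0; x < width; ++x) {
            // The result is height wide and width high: element (row x, col y).
            // Assumption: a real 8-bit output would round and clamp here.
            dst[x * height + y] = static_cast<Out>(forward[x] + backward[x]);
        }
    }
    return dst;
}
```

Calling it once on the 8-bit input gives the floating point temp image; calling it again on the temp image (with width and height swapped) puts the result back in the original orientation.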


just for reference, here's the first calculation loop:

// first four pixels are handled outside the loop.
// pCurPix is a pointer to the source image (BYTEs, variable number of channels).
// pTempPos is an array of floating point values.
// the two factor arrays are floating point, too.
for (x = 4; x < width; x++)
{
   pTempPos[x] =
     numerator_factors_positive[0] * pCurPix[0]
   + numerator_factors_positive[1] * pCurPix[-iChannels]
   + numerator_factors_positive[2] * pCurPix[-2 * iChannels]
   + numerator_factors_positive[3] * pCurPix[-3 * iChannels]
   - denominator_factors_positive[1] * pTempPos[x - 1]
   - denominator_factors_positive[2] * pTempPos[x - 2]
   - denominator_factors_positive[3] * pTempPos[x - 3]
   - denominator_factors_positive[4] * pTempPos[x - 4];

   pCurPix += iChannels; // advance one pixel; same stride as the offsets above
}
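
For anyone who wants to experiment with this loop in isolation, here is a self-contained sketch of the same recurrence as a function template. The name `forwardPass` and its parameters are made up for illustration. The post does not show how the first four pixels are handled, so this version simply assumes that samples before the start of the row are zero. It also sums the terms in loops rather than in one expression, so results may differ from the original in the last bits.

```cpp
// Hypothetical standalone version of the left-to-right pass.
// Real  : float or double (type of the factors and the output row)
// Pixel : source sample type (e.g. unsigned char for the first pass)
// num   : 4 numerator factors, den : 5 denominator factors (den[0] unused)
template <typename Real, typename Pixel>
void forwardPass(const Pixel* src, int width, int channels,
                 const Real num[4], const Real den[5], Real* out)
{
    for (int x = 0; x < width; ++x) {
        Real acc = 0;

        // Feed-forward part: current pixel and up to three before it.
        for (int k = 0; k < 4; ++k) {
            if (x - k >= 0)  // assumption: samples before the row start are 0
                acc += num[k] * src[(x - k) * channels];
        }

        // Feedback part: up to four previously computed outputs.
        for (int k = 1; k <= 4; ++k) {
            if (x - k >= 0)
                acc -= den[k] * out[x - k];
        }

        out[x] = acc;
    }
}
```

Instantiating it with `Real = float` or `Real = double` mirrors the float/double switch described in the post.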


that's the first loop. the third loop looks identical, except that pCurPix is a pointer to a floating point value in the third loop - it is a BYTE pointer here. the 2nd and 4th loops are very similar to that and are also identical to each other - again, except for the pCurPix data type.

also, i wrote this code as a template so i can switch the type of the floating point data from float to double, for testing (the "factor" arrays, the temp row buffers and the temp image).

a little more info:

one of the parameters ("sigma") to the function is used to set the values in those factor arrays. the algorithm is constant complexity with respect to that parameter: sigma changes the values that pixels are multiplied by, not the number of times the calculations happen. the only thing that influences the amount of calculation performed is the size of the input image. in theory...

another parameter is the number of color channels in the input image (1=grayscale, 3=RGB, 4=RGBA, etc.)

and finally, here's the problem:

when:
1. the class is using floats for the f.p. data type
2. the image is a single channel
3. the value of sigma is near 3...

the third loop (again, which looks exactly like what i've posted above) slows down to where it's literally ten times slower than all the other loops. as i move sigma away from three, the performance quickly returns to where it should be: by 6, loop #3 is as fast as the rest, and they all stay the same speed, as far as i can tell, for all other values of sigma: 50 is as fast as 6, and 70 is as fast as 6.

so it would seem the solution is to use doubles. but an array of doubles is 2x as big as an array of floats. and, even worse, the float version of this is 2x faster than the double version!

here are the overall timings for the float version (sigma in the 1st column, time in seconds for 50 reps in the 2nd):

sigma time(s)
0.10 0.44
0.60 0.56
1.10 0.76
1.60 0.80
2.10 1.29
2.60 1.48 -- spike, around 3.0
3.10 1.44
3.60 1.25
4.10 0.97
4.60 0.66
5.10 0.45
5.60 0.39
6.10 0.34
6.50 0.34

.. and then it stays at 0.34s until sigma = 93.5, when it totally blows up and does this:

93.00 0.34
93.10 0.36
93.20 0.34
93.30 0.34
93.40 0.34
93.50 27.22
93.60 27.38
93.70 27.19
93.80 0.34
93.90 0.36
94.00 0.36
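
A minimal sketch of how timings like these could be collected, assuming a C++11 compiler; `timeReps` is a hypothetical helper and not the harness used for the numbers above. Wrapping each of the four loops separately would show which one accounts for the extra time at a given sigma.

```cpp
#include <chrono>

// Hypothetical helper: calls fn() reps times and returns the total
// elapsed wall-clock time in seconds.
template <typename Fn>
double timeReps(Fn fn, int reps)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) {
        fn();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```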

these timings are on a Core2 2.4GHz. but i can duplicate the slowdown on a single-core Pentium D 2.8.

anybody have any idea what could be going on?

update: ok, 93.5 issue is when some of the array values go to infinity... i don't see anything like that at 3.0, though.
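
One way to check what kinds of values end up in the row buffers or the temp image is to classify them with `std::fpclassify` from `<cmath>`. The sketch below is only a diagnostic idea; `ValueCounts` and `classifyValues` are made-up names. It counts infinities (the case reported at 93.5) and also subnormal values, simply because `fpclassify` reports them as a separate class. Whether subnormals have anything to do with the slowdown near sigma 3 is an open question here, not something the post establishes.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical tally of floating point value classes in a buffer.
struct ValueCounts {
    std::size_t normal = 0;
    std::size_t zero = 0;
    std::size_t subnormal = 0;
    std::size_t infinite = 0;
    std::size_t nan = 0;
};

// Works for float or double buffers.
template <typename Real>
ValueCounts classifyValues(const Real* data, std::size_t count)
{
    ValueCounts c;
    for (std::size_t i = 0; i < count; ++i) {
        switch (std::fpclassify(data[i])) {
            case FP_NORMAL:    ++c.normal;    break;
            case FP_ZERO:      ++c.zero;      break;
            case FP_SUBNORMAL: ++c.subnormal; break;
            case FP_INFINITE:  ++c.infinite;  break;
            case FP_NAN:       ++c.nan;       break;
            default:                          break;
        }
    }
    return c;
}
```

Running this on the output of each loop at sigma = 3.0 and again at sigma = 6.0 would show whether the value mix differs between the slow and fast cases.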


modified on Wednesday, September 30, 2009 4:59 PM

Answer: Re: float vs double - tricky performance issue
harold aptroot, 30-Sep-09 12:15
General: Re: float vs double - tricky performance issue
Chris Losinger, 30-Sep-09 12:19
Answer: Re: float vs double - tricky performance issue
TimothyPMoore, 30-Sep-09 12:56
General: Re: float vs double - tricky performance issue [modified]
Chris Losinger, 30-Sep-09 13:01
General: Re: float vs double - tricky performance issue
TimothyPMoore, 30-Sep-09 14:02
General: Re: float vs double - tricky performance issue
Rick York, 30-Sep-09 14:29
General: Re: float vs double - tricky performance issue
Chris Losinger, 30-Sep-09 14:38
Question: Enumerated Type Not Comparing Correctly
Jim Fell, 30-Sep-09 9:49
Answer: Re: Enumerated Type Not Comparing Correctly
includeh10, 30-Sep-09 9:56
General: Re: Enumerated Type Not Comparing Correctly
Jim Fell, 30-Sep-09 10:02
Question: Re: Enumerated Type Not Comparing Correctly
CPallini, 30-Sep-09 10:23
Answer: Re: Enumerated Type Not Comparing Correctly
Jim Fell, 30-Sep-09 10:58
General: Re: Enumerated Type Not Comparing Correctly
CPallini, 30-Sep-09 11:57
Answer: Re: Enumerated Type Not Comparing Correctly
Jim Fell, 30-Sep-09 10:59
Question: Is exe file aligned in size?
includeh10, 30-Sep-09 6:09
Answer: Re: Is exe file aligned in size?
Joe Woodbury, 30-Sep-09 7:01
General: Re: Is exe file aligned in size?
includeh10, 30-Sep-09 9:46
