|
Well, I've just answered, but I don't see my reply... So let's go point by point.
Stefan_Lang wrote:
You should really look at the problem again and consider some or all of the following questions:
1. Where did you get the information that
- _lrct$ refers to the start address of the RECT struct you're referring to
What do you think .asm files exist for? The compiler generated them, and I was able to see where the local and temporary variables were allocated.
- the RECT struct is really 16 bytes in size
Good question, it came to my mind too, but the same compiler was kind enough to show me the size of the RECT structure.
- the offsets you show are in fact relative to the same base address
You should know that the ebp register remains constant within a procedure! And in this particular case all local and temporary variables were addressed via ebp.
2. Did you derive from your observation of these addresses that part of your data is overwritten, or did you check the actual RECT structure to verify that?
I saw it while running my program under the debugger, and I know which instruction (fst) overwrote the aforementioned RECT (in fact, only right and bottom).
3. What are the original declarations of the C/C++ symbols corresponding to the two addresses _lrct$ and tv5476?
RECT lrct; the second variable is local storage used to temporarily hold ST(0).
4. How were the two objects allocated?
I already wrote it, but I can repeat it. _lrct=-212 and tv5476=-204 (the numbers are decimal!). That means the instruction fst tv5476[ebp] overwrites 8 bytes, one half of the lrct structure.
5. Are you sure that one of them (the RECT) hasn't been deallocated in the meantime? Note that optimizers may discard variables before the end of their lifetime as seen in code if they realize it is no longer used!
That would be possible, but in that case the debugger refuses to show the variable and gives a message like "expression cannot be evaluated". Moreover, the compiler wouldn't let me use a variable outside the block where it was declared.
I'm sure I could think of more questions, but this could be much more productive if we could see the actual code ...
You saw it. Answer the question originally addressed to Richard: which variant refuses to work properly, and why?
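The arithmetic behind point 4 can be checked mechanically. The following is only a sketch: the Slot type and the helper names are hypothetical, and the only inputs taken from the thread are the ebp-relative offsets (-212 and -204) and the sizes (16 bytes for the RECT, 8 bytes for the qword written by fst).

```cpp
#include <algorithm>
#include <cassert>

// A stack slot as it appears in an .asm listing comment such as
// "_lrct$ = -212 ; size = 16": an ebp-relative offset plus a size in bytes.
struct Slot {
    int offset;  // signed offset from ebp, in bytes
    int size;    // size of the object, in bytes
};

// Number of bytes shared by two slots; 0 means they do not overlap.
int overlap_bytes(const Slot& a, const Slot& b) {
    assert(a.size >= 0 && b.size >= 0);
    const int begin = std::max(a.offset, b.offset);
    const int end = std::min(a.offset + a.size, b.offset + b.size);
    return std::max(0, end - begin);
}

// Offset of the first shared byte relative to the start of slot a,
// or -1 when the two slots are disjoint.
int overlap_start_in(const Slot& a, const Slot& b) {
    if (overlap_bytes(a, b) == 0) return -1;
    return std::max(a.offset, b.offset) - a.offset;
}
```

With the numbers above, overlap_bytes(Slot{-212, 16}, Slot{-204, 8}) is 8 and overlap_start_in is 8, so the store would cover bytes 8..15 of the structure. In a RECT (left, top, right, bottom, 4 bytes each) those are right and bottom, which matches the description in point 2.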
|
|
|
|
|
1.a) I realize that one is from the generated assembler code, but "I am able to see" is not an answer to my question. But never mind, you later wrote the original symbol is named _lrct, which I consider sufficient information at this point.
Considering the fact that symbols starting with '_' are common within Windows system libraries and the MFC, and 'rct' is a common name fragment used for Windows rectangle types and variables, I presume this is the Windows RECT struct that you're talking about?
1b) Are you saying you used the sizeof operator? That would be the only way I know of that the compiler can 'tell' you as much. Or if you derived it from bits of the assembler code, I have to trust your word for it - I don't know how to extract that kind of information.
1c) It could have been different procedures, or even different threads. Before your more recent answers there was no way to tell. So this is not (part of) the cause of the problem.
2 As mentioned in another post: you can't trust debugger output in optimized code! Some variables may not be stored on the stack at all, others may be overwritten before they expire, and addresses used to view a particular element may not contain the correct value, or the most recent state of the variable, due to caching or memory optimization. The only way to be sure of the actual, current state of a variable is to print it out to the console or maybe a log file.
4 I meant memory allocation, as in whether they are on the stack, the heap, or temporaries. But in light of the other responses this is no longer important.
5: see 2. - Don't trust the debugger in optimized code.
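The print-it-out approach from point 2 can be very small. This is a sketch only: Rect is a stand-in for the Windows RECT so that the code builds without <windows.h>, and format_rect and log_rect are hypothetical helper names, not part of the original program.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Stand-in for the Windows RECT (same field order: left, top, right, bottom).
struct Rect {
    long left, top, right, bottom;
};

// Formats the rectangle together with a tag that names the checkpoint.
std::string format_rect(const char* tag, const Rect& r) {
    assert(tag != nullptr);
    char buf[128];
    std::snprintf(buf, sizeof buf, "%s: left=%ld top=%ld right=%ld bottom=%ld",
                  tag, r.left, r.top, r.right, r.bottom);
    return std::string(buf);
}

// Writes one line per checkpoint to stderr.
void log_rect(const char* tag, const Rect& r) {
    std::fprintf(stderr, "%s\n", format_rect(tag, r).c_str());
}
```

Calling log_rect before and after the suspect statement shows the values the program itself sees. One caveat: passing the variable by reference changes what the optimizer has to do with it (it usually has to live in memory), so the symptom may move or disappear once the logging is added.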
I still have some doubts the compiler has a bug - the symptom you describe is just too obvious. My guess is that it's a result of optimization. But with your responses you excluded a number of possible alternate causes. If you still feel it is a genuine bug you could report it to MS. However, I'm not sure whether they'll look into an 8 year old compiler when they have various newer versions to offer: if they find a bug, they'll probably only fix it in the newest version(s).
GOTOs are a bit like wire coat hangers: they tend to breed in the darkness, such that where there once were few, eventually there are many, and the program's architecture collapses beneath them. (Fran Poretto)
|
|
|
|
|
Stefan_Lang wrote:
1.a) I realize that one is from the generated assembler code, but "I am able to see" is not an answer to my question. But never mind, you later wrote the original symbol is named _lrct, which I consider sufficient information at this point.
Not exactly. My code had a simple line like
RECT lrct; which the compiler converted into
tv6649 = -332 ; size = 4
_lrct$ = -328 ; size = 16
_sf$204092 = -312 ; size = 8
There is no "sizeof" involved; yes, that operator has caused me problems once or twice. Look at how lrct is placed here: correctly. (This is another procedure! I rewrote the one that caused the problem, but as I wrote, the difference there was not 16 bytes but only 8.) I cannot show you the code that caused the problem, because I rewrote it. The wrong allocation can now be found only here, in this thread.
Considering the fact that symbols starting with '_' are common within Windows system libraries and the MFC, and 'rct' is a common name fragment used for Windows rectangle types and variables, I presume this is the Windows RECT struct that you're talking about?
Correct. But the _ (and the $ as well) was added by the compiler.
1b) Are you saying you used the sizeof operator? That would be the only way I know of that the compiler can 'tell' you as much. Or if you derived it from bits of the assembler code, I have to trust your word for it - I don't know how to extract that kind of information.
I did not say that; see above. The compiler wrote the size as a comment in the asm listing.
1c) It could have been different procedures, or even different threads. Before your more recent answers there was no way to tell. So this is not (part of) the cause of the problem.
2 As mentioned in another post: you can't trust debugger output in optimized code! Some variables may not be stored on the stack at all, others may be overwritten before they expire, and addresses used to view a particular element may not contain the correct value, or the most recent state of the variable, due to caching or memory optimization. The only way to be sure of the actual, current state of a variable is to print it out to the console or maybe a log file.
What makes you think so? I look at the assembler code (plus the source, of course) when I run the release build under the debugger. And it shows me exactly what the program does.
4 I meant memory allocation, as in whether they are on the stack, the heap, or temporaries. But in light of the other responses this is no longer important.
5: see 2. - Don't trust the debugger in optimized code.
I do. But as I wrote, I use the assembler view in such cases. Without it I couldn't have found where the problem is.
If you try to debug a release build using the source code only, you will be really surprised.
I still have some doubts the compiler has a bug - the symptom you describe is just too obvious. My guess is that it's a result of optimization. But with your responses you excluded a number of possible alternate causes. If you still feel it is a genuine bug you could report it to MS. However, I'm not sure whether they'll look into an 8 year old compiler when they have various newer versions to offer: if they find a bug, they'll probably only fix it in the newest version(s).
I'm sure they won't. And I'm sure that it is a bug; what else? Optimization, yes, but the optimizer could be buggy too. The question still remains, however, why this bug affects only one particular procedure. I once saw a bug in Borland's compiler. That one was quite cute: I wrote some assembler code, and the compiler removed one instruction from it.
|
|
|
|
|
Interesting bit about the size comments. I don't recall ever seeing that - maybe a different setting...
I trust that by now you realized that it is hard for us to reproduce the exact situation that led to the problem.
I still hold that it's dangerous to deduce anything from optimized code, short of print statements or similar right in the original code: you'll never know just what the compiler did, and why, to optimize your memory footprint and performance. It's difficult to draw correlations from assembler to the original code, and the expectations that come with it.
The code you posted doesn't look very complex. About the only optimization I would anticipate is that some of the local variables would be stored in register only, rather than on the stack. But then, optimizers work in mysterious ways - you'll never know what kind of optimization they can come up with until you see the code...
GOTOs are a bit like wire coat hangers: they tend to breed in the darkness, such that where there once were few, eventually there are many, and the program's architecture collapses beneath them. (Fran Poretto)
|
|
|
|
|
Interesting bit about the size comments. I don't recall ever seeing that - maybe a different setting...
Which compiler do you use? I only asked mine to generate an asm file with source code. Everything else was default settings. (Sorry, later I added /Zi to generate debug symbols in the release version, but that does not affect generation of the asm listing.)
I trust that by now you realized that it is hard for us to reproduce the exact situation that led to the problem.
I do. You have no choice but to trust me (and stop looking where the light is!).
I still hold that it's dangerous to deduce anything from optimized code, short of print statements or similar right in the original code: you'll never know just what the compiler did, and why, to optimize your memory footprint and performance. It's difficult to draw correlations from assembler to the original code, and the expectations that come with it.
But I do know exactly what the compiler does!!! What do you think the asm listing exists for? The correlation is very simple and there are no problems with it. No matter which sort of optimization is in use, the resulting exe file must do the same things.
The code you posted doesn't look very complex. About the only optimization I would anticipate is that some of the local variables would be stored in register only, rather than on the stack. But then, optimizers work in mysterious ways - you'll never know what kind of optimization they can come up with until you see the code...
I saw it. And without it I would never have found where the problem is.
GOTOs are a bit like wire coat hangers: they tend to breed in the darkness, such that where there once were few, eventually there are many, and the program's architecture collapses beneath them. (Fran Poretto)
Even Stroustrup says that there are situations where goto is useful. Suppose you need to get out of nested loops...
do {
    do {
        do {
            if (something_is_wrong)
                goto Skip_loops;
        } while (cond_A);
    } while (cond_B);
} while (cond_C);
Skip_loops:;
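A runnable instance of the same pattern, as a sketch: the search below leaves three nested loops with a single goto once a match is found. The names Triple and find_product are made up for illustration, and for loops are used instead of do/while only to keep the example short.

```cpp
#include <cassert>

// Result of the search: the first matching triple, if any.
struct Triple {
    int i, j, k;
    bool found;
};

// Looks for the first (i, j, k) in [1, limit]^3 with i * j * k == target.
Triple find_product(int target, int limit) {
    assert(limit >= 1);
    Triple result{0, 0, 0, false};
    for (int i = 1; i <= limit; ++i) {
        for (int j = 1; j <= limit; ++j) {
            for (int k = 1; k <= limit; ++k) {
                if (i * j * k == target) {
                    result = Triple{i, j, k, true};
                    goto done;  // leaves all three loops at once
                }
            }
        }
    }
done:
    return result;
}
```

The usual alternatives are a flag tested in every loop condition or moving the loops into a function and using return; which one reads best is a matter of taste.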
|
|
|
|
|
a_matseevsky wrote: Any idea how could it be?
Perhaps nonstandard compiler options or linker options.
And I am rather certain that max() is a macro.
Which means that the only possible source of the problem would be an optimization.
Excluding of course some pointer bug, which could cause almost any problem.
And I am also certain that you could write a trivially small program that would demonstrate this, which is basically what has already been asked. If you cannot in fact demonstrate the same problem with the trivial program then it would point to some other problem in your application.
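On the remark that max() is a macro: the Windows headers define max and min as function-like macros unless NOMINMAX is defined, and a macro of that shape evaluates one of its arguments twice. The sketch below shows only that difference. MAX_MACRO is a hypothetical stand-in with the same shape and next_value is an invented helper; nothing here is claimed to explain the overwrite reported in this thread.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in with the same shape as the Windows max macro.
#define MAX_MACRO(a, b) (((a) > (b)) ? (a) : (b))

static int g_calls = 0;  // counts how often next_value() runs

int next_value() {
    ++g_calls;
    return 5;
}

// Number of times the argument expression runs when passed to the macro.
int calls_with_macro() {
    g_calls = 0;
    int result = MAX_MACRO(next_value(), 3);  // expands to two calls here
    assert(result == 5);
    return g_calls;
}

// Number of times the argument expression runs when passed to std::max.
int calls_with_function() {
    g_calls = 0;
    int result = std::max(next_value(), 3);  // argument evaluated once
    assert(result == 5);
    return g_calls;
}
```

Both variants return the same maximum; only the number of evaluations differs (two for the macro in this case, one for std::max), which matters as soon as an argument has side effects or is expensive.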
|
|
|
|
|
There is no way to demonstrate the problem if the compiler is unstable. The problem disappeared when I replaced max() and min(), but occurred again after some insignificant change in the code (made far from the procedure where the problem occurred!). I used the default compiler options. Later, when I was looking for the source of the problem, I created the debug database and set the /Zi switch for the release version. After that I was able to find the very instruction that destroyed my data. The rest was trivial: I found the offsets of the local variables and noticed that some of them were incorrectly placed (one partially overlaps another). This is a bug in the compiler, not in my code.
|
|
|
|
|
a_matseevsky wrote: This is a bug in the compiler, not in my code.
Then go and tell Microsoft about it, as I already suggested, since you refuse to show us any of the code which causes the problem.
Veni, vidi, abiit domum
|
|
|
|
|
I will. I've just registered at M$'s site. And I gave you all the info needed to understand where the problem is. For me the situation is clear. Here is what I got running the release build under the debugger:
lrct {top=30 bottom=958 left=162 right=750}
0041A88F fst qword ptr [ebp-0D4h]
lrct {top=30 bottom=1080213504 left=162 right=0}
Here you can see what one single instruction did: the RECT structure before and after it. What additional info do you need? The prefix "tv" evidently stands for temporary variable. This variable was placed too close to the storage of the RECT structure, and fst overwrote part of it. If you cannot see that, I wash my hands of it. If you do not know which bytes fst overwrites, sorry, man: there are problems which obviously cannot be solved at the level of source code. One must dig deeper. And to do that, you must know many things, like assembler, the structure of a stack frame, and so on.
(lrct was stored at ebp-0DCh)
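The dump above is consistent with an 8-byte floating-point store landing on right and bottom. With lrct at ebp-0DCh and the RECT fields laid out as left, top, right, bottom (4 bytes each), right sits at ebp-0D4h, exactly the address fst writes to. The sketch below splits a double into its two 32-bit halves; split_double and join_double are hypothetical helper names. Read this way, right=0 and bottom=1080213504 are the low and high halves of the double 150.0, which would then be the value held in ST(0); that is an inference from the numbers, not something stated in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// The two 32-bit halves of a 64-bit IEEE 754 double.
struct DoubleHalves {
    std::uint32_t low;   // stored at the lower address on little-endian x86
    std::uint32_t high;  // stored at the higher address
};

// Splits a double into its low and high 32-bit words.
DoubleHalves split_double(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects a 64-bit double");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to read the bit pattern
    return DoubleHalves{static_cast<std::uint32_t>(bits & 0xFFFFFFFFu),
                        static_cast<std::uint32_t>(bits >> 32)};
}

// Inverse operation: rebuilds the double from its two halves.
double join_double(std::uint32_t low, std::uint32_t high) {
    const std::uint64_t bits = (static_cast<std::uint64_t>(high) << 32) | low;
    double value = 0.0;
    std::memcpy(&value, &bits, sizeof value);
    assert(split_double(value).high == high);  // round trip keeps the bit pattern
    return value;
}
```

join_double(0, 1080213504u) yields 150.0, and split_double(150.0) yields low 0 and high 1080213504 (0x4062C000). On little-endian x86 the low half goes to the lower address (right) and the high half to the higher one (bottom).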
|
|
|
|
|
a_matseevsky wrote: there are problems which obviously cannot be solved at the level of source code
But without the source code it is impossible to understand the structure of the program or why these variables are being allocated the way they are. You keep saying you have given us all the information, but all you have shown is two or three lines of assembler without any context, so there is no way we can begin to understand what is happening or why. I must confess I am at a total loss to understand why you refuse to provide the information that we have asked for so we can try and help you.
Veni, vidi, abiit domum
|
|
|
|
|
You cannot help me. How could you help me if you do not even know which bytes the fst instruction overwrites? I am not posting my procedure only because of its size. And I now know where the problem is. Well, if you think that the problem is in my code, look at these two small pieces and try to work out which one works properly, which one does not, and why. I fixed the problem (well, not completely: I only rewrote my code, using what I know about the compiler's incorrect behavior. lrct was removed altogether.)
Variant A:
int xmin=xf+(fct*pdpr->m_croprect[Right].left-xf-ddi)*(xl-xf)/(yr1-yr0);
int xmax=xl+(fct*pdpr->m_croprect[Left].right-xl)*(xl-xf)/(yl1-yl0);
h=0.5*pdpr->m_croprect[Left].top+0.5*pdpr->m_croprect[Left].bottom;
pdpr->GetKorrHorz2D(xfirst, xlast, Dv, Dh, h, ddi, yh, yv, yhs, yvs, pKorr, ph, pv);
pdpr->GetCorrLineH2(xfirst, xlast, h, ddi, yh, yv, yh, yv, ph, pv, 0, 0);
yl0=xf+0.5*yh[xf];
yl1=xl+0.5*yh[xl];
yr0=xf+ddi-0.5*yh[xf];
yr1=xl+ddi-0.5*yh[xl];
int xl2=xf+(fct*pdpr->m_croprect[Right].left-xf-ddi)*(xl-xf)/(yr1-yr0);
if(xl2>xmin)
    xmin=xl2;
int xr2=xl+(fct*pdpr->m_croprect[Left].right-xl)*(xl-xf)/(yl1-yl0);
if(xmax>xr2)
    xmax=xr2;
h=0.1*pdpr->m_croprect[Left].top+0.9*pdpr->m_croprect[Left].bottom;
pdpr->GetKorrHorz2D(xfirst, xlast, Dv, Dh, h, ddi, yh, yv, yhs, yvs, pKorr, ph, pv);
pdpr->GetCorrLineH2(xfirst, xlast, h, ddi, yh, yv, yh, yv, ph, pv, 0, 0);
yl0=xf+0.5*yh[xf];
yl1=xl+0.5*yh[xl];
yr0=xf+ddi-0.5*yh[xf];
yr1=xl+ddi-0.5*yh[xl];
int xl3=xf+(fct*pdpr->m_croprect[Right].left-xf-ddi)*(xl-xf)/(yr1-yr0);
if(xl3>xmin)
    xmin=xl3;
int xr3=xl+(fct*pdpr->m_croprect[Left].right-xl)*(xl-xf)/(yl1-yl0);
if(xmax>xr3)
    xmax=xr3;
xfirst=xmin+0.02*(xmax-xmin);
xlast=xmax-0.02*(xmax-xmin);
Variant B:
int xl1=xf+(fct*pdpr->m_croprect[Right].left-xf-ddi)*(xl-xf)/(yr1-yr0);
int xr1=xl+(fct*pdpr->m_croprect[Left].right-xl)*(xl-xf)/(yl1-yl0);
h=0.5*pdpr->m_croprect[Left].top+0.5*pdpr->m_croprect[Left].bottom;
pdpr->GetKorrHorz2D(xfirst, xlast, Dv, Dh, h, ddi, yh, yv, yhs, yvs, pKorr, ph, pv);
pdpr->GetCorrLineH2(xfirst, xlast, h, ddi, yh, yv, yh, yv, ph, pv, 0, 0);
yl0=xf+0.5*yh[xf];
yl1=xl+0.5*yh[xl];
yr0=xf+ddi-0.5*yh[xf];
yr1=xl+ddi-0.5*yh[xl];
int xl2=xf+(fct*pdpr->m_croprect[Right].left-xf-ddi)*(xl-xf)/(yr1-yr0);
int xr2=xl+(fct*pdpr->m_croprect[Left].right-xl)*(xl-xf)/(yl1-yl0);
h=0.1*pdpr->m_croprect[Left].top+0.9*pdpr->m_croprect[Left].bottom;
pdpr->GetKorrHorz2D(xfirst, xlast, Dv, Dh, h, ddi, yh, yv, yhs, yvs, pKorr, ph, pv);
pdpr->GetCorrLineH2(xfirst, xlast, h, ddi, yh, yv, yh, yv, ph, pv, 0, 0);
yl0=xf+0.5*yh[xf];
yl1=xl+0.5*yh[xl];
yr0=xf+ddi-0.5*yh[xf];
yr1=xl+ddi-0.5*yh[xl];
int xl3=xf+(fct*pdpr->m_croprect[Right].left-xf-ddi)*(xl-xf)/(yr1-yr0);
int xr3=xl+(fct*pdpr->m_croprect[Left].right-xl)*(xl-xf)/(yl1-yl0);
int xmin=xl1;
int xmax=xr1;
if(xl2>xmin)
    xmin=xl2;
if(xl3>xmin)
    xmin=xl3;
if(xmax>xr2)
    xmax=xr2;
if(xmax>xr3)
    xmax=xr3;
xfirst=xmin+0.02*(xmax-xmin);
xlast=xmax-0.02*(xmax-xmin);
|
|
|
|
|
Thank you, all becomes clear.
Veni, vidi, abiit domum
|
|
|
|
|
You're welcome. I can recommend one useful book, written by John Robbins: "Debugging Applications". If you haven't read it, you'll find many useful things in it.
|
|
|
|
|
Thanks but I have enough books on my reading list already.
Veni, vidi, abiit domum
|
|
|
|
|
It is only a suggestion. There are few books on debugging, and this one is really useful.
|
|
|
|
|
Richard, you have the patience of a saint.
We can’t stop here, this is bat country - Hunter S Thompson RIP
|
|
|
|
|
Please tell my wife and children.
Veni, vidi, abiit domum
|
|
|
|
|
a_matseevsky wrote: There is no way to demonstrate the problem if the compiler is unstable. The problem disappeared
Compilers are not "unstable" in the way you are suggesting. If the environment is the same and stable, the compiler is the same, and the source (and build environment) is the same, then, excluding some runtime constants such as timestamps and hashes, the output is the same.
a_matseevsky wrote: when I replaced max() and min(), but occurred again after some insignificant change in the code (made far from the procedure where the problem occurred!). ... This is a bug in the compiler, not in my code.
Sorry, but that statement suggests that there is no other possible reason except that you have a pointer bug. Your application is misusing something somewhere.
The nature of max and what it does in terms of integers has not changed in years. One can't do much more in terms of optimizations with it, since for the most part it is how the integers themselves are located and not the execution that can be optimized (and I only mention that because optimization was the only other explanation).
However, pointer bugs can impact almost anything in the application. And although a pointer bug might cause a problem where the actual pointer is in use, it can in fact show up only far from the buggy code. (And I know this from experience, not conjecture.)
It also doesn't need to reflect anything in the code that you changed, in that the changed code doesn't need to have a pointer in it. What matters is that you changed the execution path (by definition that is what code changes mean), and because of that something that could have been a bug but went undetected for months or even years now causes some unexpected failure.
And I want to emphasize again that the pointer bug could be anywhere. The behavior you are seeing is a symptom, not a cause.
|
|
|
|
|
I can only recommend that you reread the whole discussion. I now know where the problem is. OK, the compiler is absolutely stable. No problem with that. But it works incorrectly. It reserves some places on the stack for temporary variables. In fact, these variables store the contents of the coprocessor's registers. Some of these temporary variables overlap (partially or completely) other local variables. That need not be a problem: some local variables are visible only within some block, not within the whole procedure. Once execution leaves a block (the code within a pair of {} brackets), all variables declared within that block become inaccessible and their place on the stack may be reused by another local variable. But the compiler creates an exe file which does this even while the local variable is still visible and accessible!!! And it happens not only with the RECT structure, but with some other local variables too. I saw this happen when I was running the release version under the debugger. Look above, where I posted a piece of my code. The variable "h" was overwritten at least once. If this is not a compiler bug, I'm definitely an elephant.
|
|
|
|
|
a_matseevsky wrote: But it works incorrectly.
I suggest you reread my post - pointer bugs can have an impact FAR later in the code.
a_matseevsky wrote: And it happens not only with the RECT structure, but with some other local variables too.
Don't know how to state this more clearly.
Either you have a pointer bug or there is a compiler problem. If the latter then reducing the code will demonstrate it AND changing code far from it and unrelated will NOT impact it.
Conversely if the former then you will not be able to reduce it because the code that you are looking at is not the source of the problem.
|
|
|
|
|
There were no pointer bugs at all. I declared no pointers, only the structure
RECT lrct
and some local variables.
The compiler allocated them on the stack and added some temporary variables, like this:
_lrct$=-212;
tv5476=-204;
It is a time bomb which might explode at any time. And it did. That's all. It's that simple.
|
|
|
|
|
a_matseevsky wrote: I declared no pointers
You have a C++ application and do not use pointers ANYWHERE in the application?
(Again it has NOTHING to do with pointers directly associated with the code where you think the bug is.)
|
|
|
|
|
I do not merely think I know where the bug is. I know it. I know exactly where the variables were incorrectly allocated on the stack and which instruction overwrote the data. Why are you talking about pointers? I do use them, of course, but they were not the cause. It is pointless to discuss what might happen while ignoring all the available info.
|
|
|
|
|
a_matseevsky wrote: Why are you talking about pointers?
Either you didn't read what I said in my previous replies or didn't understand what I said.
|
|
|
|
|
jschell wrote:
Either you didn't read what I said in my previous replies or didn't understand what I said.
I did. It was you who did not read (or did not understand what you read). Look at these quotes from your messages:
I want to emphasize again that the pointer bug could be anywhere. The behavior you are seeing is a symptom, not a cause.
Don't know how to state this more clearly.
Either you have a pointer bug or there is a compiler problem. If the latter then reducing the code will demonstrate it AND changing code far from it and unrelated will NOT impact it.
Conversely if the former then you will not be able to reduce it because the code that you are looking at is not the source of the problem.
You have a C++ application and do not use pointers ANYWHERE in the application?
(Again it has NOTHING to do with pointers directly associated with the code where you think the bug is.)
These are empty words about what might happen. They have nothing in common with the real situation.
The compiler placed a temporary variable too close to another one, and when tv5476 was used, the RECT structure was partially overwritten. It's that simple. It is not a pointer problem, it is a compiler problem.
|
|
|
|
|