Ulrich Weigand wrote:
> 
> Eric Pouech wrote:
> 
> > the only notable difference (NTSTATUS ret get optimized in register variable %eax)
> > is twice the "add    $0xfffffff4,%esp" in the buggy case
> > this explains the stack trashing (-12 on stack can give bad results) and also
> > why it was working with +relay (relay adds another call level, hence the stack
> > trashing happened on the relay code, not on the caller...)
> 
> Why is this a bug?  Wasting another 12 bytes of stack space shouldn't hurt
> anything (I mean, the first 12 bytes aren't needed either, nor are the
> 20 bytes stack frame ...).
Hmm, did I forget that the stack grows *down*? Sounds like it ;-)

> gcc 2.95.2 generates the same code.  By the way, all this wasting of stack
> space is due to a default setting in the i386 backend that wants to keep
> the stack pointer aligned at a 16-byte boundary at every function call.
> I have no idea why this is desirable :-/  
From reading the gcc doc:
>      The stack is required to be aligned on a 4 byte boundary.  On
>      Pentium and PentiumPro, `double' and `long double' values should be
>      aligned to an 8 byte boundary (see `-malign-double') or suffer
>      significant run time performance penalties.  On Pentium III, the
>      Streaming SIMD Extension (SSE) data type `__m128' suffers similar
>      penalties if it is not 16 byte aligned.
So it seems to be for performance reasons on the latest CPUs.

> In any case, it gets the alignment
> wrong in the case of nested function calls.  (The code isn't buggy, it just
> doesn't maintain that alignment.)
> 
> You can switch this alignment behaviour off using the
>    -mpreferred-stack-boundary=2
> option.  This will omit *all* these stack pointer manipulations
> in the RegCloseKey case.
Well, thanks for the hint, but compiling the "old" code (nested function
call in the return statement, recompiling only memory/registry.c) gives:
-mpreferred-stack-boundary=2 works
-mpreferred-stack-boundary=3 exhibits the bug again
-mpreferred-stack-boundary=4 exhibits the bug again (of course, that's the
default value)

My first interpretation was wrong, but this stack alignment issue plays a role
in this bug... I don't have a clear idea why...

A+
-- 
---------------
Eric Pouech (http://perso.wanadoo.fr/eric.pouech/)
"The future will be better tomorrow", Vice President Dan Quayle
