Eric Pouech wrote:

> the only notable difference (NTSTATUS ret get optimized in register variable %eax)
> is twice the "add    $0xfffffff4,%esp" in the buggy case
> this explain the stack trashing (-12 on stack can give bad results) and also
> why it was working with +relay (relay adds another call level, hence the stack
> trashing happened on the relay code, not on the caller...)

Why is this a bug?  Wasting another 12 bytes of stack space shouldn't hurt
anything (I mean, the first 12 bytes aren't needed either, nor is the
20-byte stack frame ...).

At the end of the routine, nobody relies on the stack pointer: all
registers are restored via %ebp, which is OK in both cases.

> I'm using gcc 2.95.1. Could folks with some other GCC versions check what they get
> there ? The simple fix would be not to nest the two calls and use an intermediate
> variable, but let's wait for the testings on other GCC version...

gcc 2.95.2 generates the same code.  By the way, all this wasting of stack
space is due to a default setting in the i386 backend that wants to keep
the stack pointer aligned at a 16-byte boundary at every function call.  
I have no idea why this is desirable :-/  In any case, it gets the alignment 
wrong in the case of nested function calls.  (The code isn't buggy, it just 
doesn't maintain that alignment.)
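
To make the arithmetic concrete: 0xfffffff4 is -12 when read as a signed
32-bit value, and 12 bytes of padding plus one 4-byte argument push move
the stack pointer by exactly 16.  The C sketch below only illustrates that
calculation.  The names esp_delta and call_padding are made up for this
example, and the single 4-byte argument is an assumption on my part, not
something taken from the GCC sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit immediate as the signed amount
 * that "add $imm,%esp" applies to the stack pointer. */
static int esp_delta(uint32_t imm)
{
    /* Two's complement: values above 0x7fffffff stand for negatives. */
    if (imm > 0x7fffffffu)
        return -(int)(uint32_t)(0u - imm);
    return (int)imm;
}

/* Hypothetical helper: padding to reserve before pushing arg_bytes of
 * arguments so that padding + arg_bytes is a multiple of boundary.
 * This is a sketch of the idea, not GCC's actual algorithm. */
static unsigned call_padding(unsigned arg_bytes, unsigned boundary)
{
    /* The boundary must be a non-zero power of two. */
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (boundary - arg_bytes % boundary) % boundary;
}
```

With these, esp_delta(0xfffffff4) gives -12 and call_padding(4, 16) gives
12, while call_padding(4, 4) gives 0, i.e. no padding is needed when only
4-byte alignment is requested.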

You can switch this alignment behaviour off using the 
   -mpreferred-stack-boundary=2
option.  This will omit *all* these stack pointer manipulations
in the RegCloseKey case.
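
For anyone who wants to compare the generated code, here is a minimal
sketch of the two shapes under discussion: a nested call, and the same
thing split with an intermediate variable as Eric suggested.  The
functions inner_call and outer_call are hypothetical stand-ins, not the
actual Wine code, and the file name in the comments is made up as well.

```c
/* Compare the assembly with and without the option, for example:
 *
 *   gcc -O2 -S example.c
 *   gcc -O2 -mpreferred-stack-boundary=2 -S example.c
 *
 * (On an x86-64 host, add -m32, assuming 32-bit support is installed;
 * the value 2 is only meaningful for 32-bit x86 code.  Newer GCC
 * versions will not produce the same output as 2.95.)
 */

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))  /* keep the calls visible */
#else
#define NOINLINE
#endif

/* Hypothetical stand-in for the inner call: returns a status code. */
NOINLINE long inner_call(int handle)
{
    return handle == 0 ? -1L : 0L;
}

/* Hypothetical stand-in for the outer call: maps status to an error. */
NOINLINE unsigned long outer_call(long status)
{
    return status < 0 ? 1UL : 0UL;
}

/* Nested form: the inner call is evaluated while the argument list of
 * the outer call is being set up. */
unsigned long close_nested(int handle)
{
    return outer_call(inner_call(handle));
}

/* Split form: an intermediate variable separates the two calls. */
unsigned long close_split(int handle)
{
    long status = inner_call(handle);
    return outer_call(status);
}
```

Both functions compute the same result; only the way the compiler
arranges the stack adjustments around the calls may differ.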

Bye,
Ulrich


-- 
  Dr. Ulrich Weigand
  [EMAIL PROTECTED]
