At 2014-04-17 08:05:22,"Jason Garrett-Glaser" <[email protected]> wrote: >On Wed, Apr 16, 2014 at 11:08 AM, Steve Borho <[email protected]> wrote: >> On Wed, Apr 16, 2014 at 12:57 PM, Steve Borho <[email protected]> wrote: >>> On Wed, Apr 16, 2014 at 12:28 PM, Jason Garrett-Glaser <[email protected]> >>> wrote: >>>>> x264 manually aligns their stack (see x264_stack_align()) >>>> >>>> Note that this is because we don't require gcc's >>>> force_arg_align_stack_pointer intrinsic (added in 4.2). The stack >>>> isn't guaranteed to be aligned on thread entry or library entry on >>>> x86_32, so even if gcc is told to keep stack alignment, it can only >>>> keep stack alignment relative to the entry point, hence we have to >>>> manually align the entry point. >>>> >>>> Note that this is important even if you don't use x264asm, because >>>> using alignment intrinsics on stack-allocated arrays does not work if >>>> the entry point hasn't been aligned. >>>> >>>> x264asm's stack alignment /should/ align relative to the values being >>>> pushed onto the stack -- if it isn't, something's probably very wrong. >>>> I'm going to assume it's not broken, since it's used for a good bit of >>>> ffmpeg and x264 code, but if it is broken, poke the x264 mailing list >>>> with assembly code and the (wrong) assembled output. >>> >>> Would it be appropriate to simply add "-mpreferred-stack-boundary=4" >>> for x86_32 compiles if GCC is new enough to support it, or set >>> HAVE_ALIGNED_STACK=0 if it is too old? >> >> Oh, your point about library entry points finally has sunk in. We >> need to either enforce stack alignment internally on all our thread or >> API entry points on x86_32 builds, or we have to set >> HAVE_ALIGNED_STACK=0 unconditionally for those builds. > >That's not correct either. HAVE_ALIGNED_STACK=0 is still not >sufficient -- all your stack-declared aligned arrays will be >misaligned and things may crash anyways. You must enforce stack >alignment at all thread starts and entry points (well, only the ones >that actually lead to asm code -- in x264 this is only like two >places, so it's not a huge deal!). osdep.h in x264 covers stack >alignment for stack-declared arrays which require higher alignment >than the stack's native alignment. > >Note something important: > >1. The way gcc enforces stack alignment for aligned elements on the >stack is to keep the stack aligned between functions, then align >elements on the stack relative to the aligned stack. This fits with >the Linux userspace convention that stacks should be 16-byte aligned >on x86_32. > >2. On ICC (and I presume MSVC), instead, data on the stack is aligned >explicitly, without making assumptions about the stack pointer's >alignment. This is slower, but works without such a convention. > >So this is a non-MS compiler issue only. Note this comes up double >e.g. on ARM, where the compiler does not even support anything above >8-byte alignment, or worse, 4-byte alignment on iOS's ABI. This is why >osdep.h has those workarounds. > >Jason : in our code, we use more than 4 regs in x86 mode, so we have to save some regs onto stack. entry: <-- align stack push ebx push esi push edi <-- stack align broken here user code...
_______________________________________________ x265-devel mailing list [email protected] https://mailman.videolan.org/listinfo/x265-devel
