At 2014-04-18 03:31:38,"Jason Garrett-Glaser" <[email protected]> wrote: >On Wed, Apr 16, 2014 at 8:50 PM, chen <[email protected]> wrote: >> >> At 2014-04-17 08:05:22,"Jason Garrett-Glaser" <[email protected]> wrote: >>>On Wed, Apr 16, 2014 at 11:08 AM, Steve Borho <[email protected]> wrote: >>>> On Wed, Apr 16, 2014 at 12:57 PM, Steve Borho <[email protected]> wrote: >>>>> On Wed, Apr 16, 2014 at 12:28 PM, Jason Garrett-Glaser <[email protected]> >>>>> wrote: >>>>>>> x264 manually aligns their stack (see x264_stack_align()) >>>>>> >>>>>> Note that this is because we don't require gcc's >>>>>> force_arg_align_stack_pointer intrinsic (added in 4.2). The stack >>>>>> isn't guaranteed to be aligned on thread entry or library entry on >>>>>> x86_32, so even if gcc is told to keep stack alignment, it can only >>>>>> keep stack alignment relative to the entry point, hence we have to >>>>>> manually align the entry point. >>>>>> >>>>>> Note that this is important even if you don't use x264asm, because >>>>>> using alignment intrinsics on stack-allocated arrays does not work if >>>>>> the entry point hasn't been aligned. >>>>>> >>>>>> x264asm's stack alignment /should/ align relative to the values being >>>>>> pushed onto the stack -- if it isn't, something's probably very wrong. >>>>>> I'm going to assume it's not broken, since it's used for a good bit of >>>>>> ffmpeg and x264 code, but if it is broken, poke the x264 mailing list >>>>>> with assembly code and the (wrong) assembled output. >>>>> >>>>> Would it be appropriate to simply add "-mpreferred-stack-boundary=4" >>>>> for x86_32 compiles if GCC is new enough to support it, or set >>>>> HAVE_ALIGNED_STACK=0 if it is too old? >>>> >>>> Oh, your point about library entry points finally has sunk in. We >>>> need to either enforce stack alignment internally on all our thread or >>>> API entry points on x86_32 builds, or we have to set >>>> HAVE_ALIGNED_STACK=0 unconditionally for those builds. >>> >>>That's not correct either. HAVE_ALIGNED_STACK=0 is still not >>>sufficient -- all your stack-declared aligned arrays will be >>>misaligned and things may crash anyways. You must enforce stack >>>alignment at all thread starts and entry points (well, only the ones >>>that actually lead to asm code -- in x264 this is only like two >>>places, so it's not a huge deal!). osdep.h in x264 covers stack >>>alignment for stack-declared arrays which require higher alignment >>>than the stack's native alignment. >>> >>>Note something important: >>> >>>1. The way gcc enforces stack alignment for aligned elements on the >>>stack is to keep the stack aligned between functions, then align >>>elements on the stack relative to the aligned stack. This fits with >>>the Linux userspace convention that stacks should be 16-byte aligned >>>on x86_32. >>> >>>2. On ICC (and I presume MSVC), instead, data on the stack is aligned >>>explicitly, without making assumptions about the stack pointer's >>>alignment. This is slower, but works without such a convention. >>> >>>So this is a non-MS compiler issue only. Note this comes up double >>>e.g. on ARM, where the compiler does not even support anything above >>>8-byte alignment, or worse, 4-byte alignment on iOS's ABI. This is why >>>osdep.h has those workarounds. >>> >>>Jason >> : >> >> in our code, we use more than 4 regs in x86 mode, so we have to save some >> regs onto stack. > >x86inc handles this already when it references values in the stack. >If it didn't, none of x264's asm would function. > yes, for value only it didn't align stack for reserve stack space (4th parameters of cglobal)
_______________________________________________ x265-devel mailing list [email protected] https://mailman.videolan.org/listinfo/x265-devel
