On Wed, Apr 16, 2014 at 8:50 PM, chen <[email protected]> wrote: > > At 2014-04-17 08:05:22,"Jason Garrett-Glaser" <[email protected]> wrote: >>On Wed, Apr 16, 2014 at 11:08 AM, Steve Borho <[email protected]> wrote: >>> On Wed, Apr 16, 2014 at 12:57 PM, Steve Borho <[email protected]> wrote: >>>> On Wed, Apr 16, 2014 at 12:28 PM, Jason Garrett-Glaser <[email protected]> >>>> wrote: >>>>>> x264 manually aligns their stack (see x264_stack_align()) >>>>> >>>>> Note that this is because we don't require gcc's >>>>> force_arg_align_stack_pointer intrinsic (added in 4.2). The stack >>>>> isn't guaranteed to be aligned on thread entry or library entry on >>>>> x86_32, so even if gcc is told to keep stack alignment, it can only >>>>> keep stack alignment relative to the entry point, hence we have to >>>>> manually align the entry point. >>>>> >>>>> Note that this is important even if you don't use x264asm, because >>>>> using alignment intrinsics on stack-allocated arrays does not work if >>>>> the entry point hasn't been aligned. >>>>> >>>>> x264asm's stack alignment /should/ align relative to the values being >>>>> pushed onto the stack -- if it isn't, something's probably very wrong. >>>>> I'm going to assume it's not broken, since it's used for a good bit of >>>>> ffmpeg and x264 code, but if it is broken, poke the x264 mailing list >>>>> with assembly code and the (wrong) assembled output. >>>> >>>> Would it be appropriate to simply add "-mpreferred-stack-boundary=4" >>>> for x86_32 compiles if GCC is new enough to support it, or set >>>> HAVE_ALIGNED_STACK=0 if it is too old? >>> >>> Oh, your point about library entry points finally has sunk in. We >>> need to either enforce stack alignment internally on all our thread or >>> API entry points on x86_32 builds, or we have to set >>> HAVE_ALIGNED_STACK=0 unconditionally for those builds. >> >>That's not correct either. HAVE_ALIGNED_STACK=0 is still not >>sufficient -- all your stack-declared aligned arrays will be >>misaligned and things may crash anyways. You must enforce stack >>alignment at all thread starts and entry points (well, only the ones >>that actually lead to asm code -- in x264 this is only like two >>places, so it's not a huge deal!). osdep.h in x264 covers stack >>alignment for stack-declared arrays which require higher alignment >>than the stack's native alignment. >> >>Note something important: >> >>1. The way gcc enforces stack alignment for aligned elements on the >>stack is to keep the stack aligned between functions, then align >>elements on the stack relative to the aligned stack. This fits with >>the Linux userspace convention that stacks should be 16-byte aligned >>on x86_32. >> >>2. On ICC (and I presume MSVC), instead, data on the stack is aligned >>explicitly, without making assumptions about the stack pointer's >>alignment. This is slower, but works without such a convention. >> >>So this is a non-MS compiler issue only. Note this comes up double >>e.g. on ARM, where the compiler does not even support anything above >>8-byte alignment, or worse, 4-byte alignment on iOS's ABI. This is why >>osdep.h has those workarounds. >> >>Jason > : > > in our code, we use more than 4 regs in x86 mode, so we have to save some > regs onto stack.
x86inc handles this already when it references values in the stack. If it didn't, none of x264's asm would function. Jason _______________________________________________ x265-devel mailing list [email protected] https://mailman.videolan.org/listinfo/x265-devel
