On Wed, Feb 14, 2024 at 03:53:24PM +0000, Andrew Cooper wrote:
> On 14/02/2024 3:29 pm, Roger Pau Monné wrote:
> > On Wed, Feb 14, 2024 at 04:08:12PM +0100, Jan Beulich wrote:
> >> On 14.02.2024 16:02, Roger Pau Monné wrote:
> >>> On Wed, Feb 14, 2024 at 10:35:58AM +0000, Frediano Ziglio wrote:
> >>>> We have just pushed an 8-byte zero, and exception constants are
> >>>> small, so we can write a single byte instead, saving 3 bytes per
> >>>> instruction.
> >>>> With ENDBR64 this reduces the size of many entry points from 32 to
> >>>> 16 bytes (due to alignment).
> >>>> Similar code is already used in autogen_stubs.
> >>> Will using movb instead of movl have any performance impact?  I don't
> >>> think we should trade speed for code size, so this needs to be
> >>> mentioned in the commit message.
> >> That's really what the last sentence is about (it could have been said
> >> more explicitly though): If doing so on interrupt paths is fine, it
> >> ought to be fine on exception paths as well.
> > I might view it the other way around: maybe it's autogen_stubs that
> > needs changing to use movl instead of movb for performance reasons?
> >
> > I think this needs to be clearly stated, and ideally some kind of
> > benchmarks should be provided to demonstrate no performance change if
> > there are doubts whether movl and movb might perform differently.
> 
> The push and the mov are overlapping stores either way.  Swapping
> between movl and movb will make no difference at all.
> 
> However, the shorter instruction ends up halving the size of the entry
> stub when alignment is considered, and that will make a marginal
> difference.  Fewer cache misses (to a first approximation, even #PF will
> be L1-cold), and better utilisation of branch prediction resources (~>
> less likely to be BP-cold).
> 
> I doubt you'll be able to see a difference without perf counters
> (whatever difference is covered here will be dwarfed by the speculation
> workarounds), but a marginal win is still a win.

I'm happy just stating in the commit message that the change doesn't
make any performance difference.
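
For the commit message, the size argument can be checked with a rough
sketch like the one below.  It assumes each entry stub is the sequence
ENDBR64; pushq $0; mov $vector, 4(%rsp); jmp <handler>, with the jmp
encoded as rel32 and stubs aligned to 16 bytes.  The instruction
sequence, the constant names and the helper functions are illustrative
assumptions, not taken from the patch itself.

```python
# Illustrative instruction lengths in bytes (standard x86-64 encodings).
ENDBR64_LEN = 4    # f3 0f 1e fa
PUSH_IMM8_LEN = 2  # 6a 00                    pushq $0
MOVL_LEN = 8       # c7 44 24 04 imm32        movl $imm32, 4(%rsp)
MOVB_LEN = 5       # c6 44 24 04 imm8         movb $imm8, 4(%rsp)
JMP_REL32_LEN = 5  # e9 rel32 (assumed; a rel8 jmp would be 2 bytes)


def stub_size(mov_len):
    """Raw size of one assumed entry stub using the given mov encoding."""
    return ENDBR64_LEN + PUSH_IMM8_LEN + mov_len + JMP_REL32_LEN


def aligned(size, align=16):
    """Round size up to the next multiple of align."""
    return (size + align - 1) // align * align


def slot_after(width, vector):
    """8-byte stack slot after pushq $0 and a width-byte store at offset 4."""
    slot = bytearray(8)  # pushq $0 leaves eight zero bytes
    slot[4:4 + width] = vector.to_bytes(width, "little")
    return bytes(slot)


if __name__ == "__main__":
    for name, mov_len in (("movl", MOVL_LEN), ("movb", MOVB_LEN)):
        raw = stub_size(mov_len)
        print(f"{name}: {raw} bytes raw, {aligned(raw)} bytes aligned")
    # For vectors below 256 both stores leave the same bytes in the slot.
    print(slot_after(1, 14) == slot_after(4, 14))
```

Under those assumptions the movl form comes to 19 bytes and spills into
a second 16-byte slot, while the movb form fits in exactly 16, and both
leave identical contents on the stack because the upper three bytes
were already zeroed by the push.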

Thanks, Roger.
