On 17/03/2025 9:09 am, Jan Beulich wrote: > On 14.03.2025 21:49, Andrew Cooper wrote: >> ... which is more consise than the opencoded form, and more efficient when >> compiled. >> >> For production VMs, ~100% of emulations are simple MOVs, so it is likely that >> there are no segments to write back. >> >> Furthermore, now that find_{first,next}_bit() are no longer in use, the >> seg_reg_{accessed,dirty} fields aren't forced to be unsigned long, although >> they do need to remain unsigned int because of __set_bit() elsewhere. >> >> No functional change. >> >> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com> > Reviewed-by: Jan Beulich <jbeul...@suse.com> > >> I still can't persuade GCC to do the early exit prior to establishing the >> stack frame, and unlike do_livepatch_work(), it's not critical enough to >> require noinline games. > Then is ... > >> --- a/xen/arch/x86/hvm/emulate.c >> +++ b/xen/arch/x86/hvm/emulate.c >> @@ -3022,18 +3022,16 @@ void hvm_emulate_init_per_insn( >> void hvm_emulate_writeback( >> struct hvm_emulate_ctxt *hvmemul_ctxt) >> { >> - enum x86_segment seg; >> + struct vcpu *curr; >> + unsigned int dirty = hvmemul_ctxt->seg_reg_dirty; >> >> - seg = find_first_bit(&hvmemul_ctxt->seg_reg_dirty, >> - ARRAY_SIZE(hvmemul_ctxt->seg_reg)); >> + if ( likely(!dirty) ) >> + return; > ... this worthwhile at all? I'm surprised anyway that I see you use likely() > here, when generally you argue against its use.
No, it's not worth it. In fact, simplifying makes the function smaller. void hvm_emulate_writeback( struct hvm_emulate_ctxt *hvmemul_ctxt) { struct vcpu *curr = current; unsigned int dirty = hvmemul_ctxt->seg_reg_dirty; for_each_set_bit ( seg, dirty ) hvm_set_segment_register(curr, seg, &hvmemul_ctxt->seg_reg[seg]); } gets a bloat-o-meter score of 131 down to 72 (-59). Are you happy for your R-by to stand, given this adjustment? ~Andrew