On Thu, Jan 27, 2022 at 04:13:47PM +0100, Jan Beulich wrote: > While we don't want to skip calling update_idle_stats(), arrange for it > to not increment the overall time spent in the state we didn't really > enter. > > Signed-off-by: Jan Beulich <jbeul...@suse.com> > --- > RFC: If we wanted to also move the tracing, then I think the part ahead > of the if() also would need moving. At that point we could as well > move update_last_cx_stat(), too, which afaict would allow skipping > update_idle_stats() on the "else" path (which therefore would go > away). Yet then, with the setting of power->safe_state moved up a > little (which imo it should have been anyway) the two > cpu_is_haltable() invocations would only have the lapic_timer_off() > invocation left in between. This would then seem to call for simply > ditching the 2nd one - acpi-idle also doesn't have a 2nd instance.
It's possible for lapic_timer_off to take a non-trivial amount of time when virtualized, but it's likely we won't be using mwait in that case, so not sure it matter much to have the two cpu_is_haltable calls if there's just a lapic_timer_off between them. > TBD: For the tracing I wonder if that really needs to come ahead of the > local_irq_enable(). Maybe trace_exit_reason() needs to, but quite > certainly TRACE_6D() doesn't. Would be good if it could be moved after the local_irq_enable call, as it's not as trivial as I've expected, and will just add latency to any pending interrupt waiting to be serviced. FWIW, I haven't spotted a need to call it with interrupt disabled. > --- > v3: Also move cstate_restore_tsc() invocation and split ones to > update_idle_stats(). > v2: New. > > --- a/xen/arch/x86/cpu/mwait-idle.c > +++ b/xen/arch/x86/cpu/mwait-idle.c > @@ -854,17 +854,23 @@ static void mwait_idle(void) > mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK); > > local_irq_disable(); > - } > > - after = alternative_call(cpuidle_get_tick); > + after = alternative_call(cpuidle_get_tick); > + > + cstate_restore_tsc(); > + > + /* Now back in C0. */ > + update_idle_stats(power, cx, before, after); > + } else { > + /* Never left C0. */ > + after = alternative_call(cpuidle_get_tick); > + update_idle_stats(power, cx, after, after); While adjusting this, could you also modify update_idle_stats to avoid increasing cx->usage if before == after (or !sleep_ticks). I don't think it's fine to increase the state counter if we never actually entered it. I was also going to suggest that you don't set 'after' and just use update_idle_stats(power, cx, before, before); but seeing as TRACE_6D also makes use of 'after' there's not much point without further rework. I also see the RFC note at the top, so while I think this is an improvement, I agree it would be nice to avoid the trace altogether if we never actually enter the state. If you want to rework the patch or send a followup that's fine for me. Thanks, Roger.