On Mon, Feb 14, 2022 at 10:24:49AM +0100, Jan Beulich wrote:
> Calibration logic assumes that the platform timer (HPET or ACPI PM
> timer) and the TSC are read at about the same time. This assumption may
> not hold when a long latency event (e.g. SMI or NMI) occurs between the
> two reads. Reduce the risk of reading uncorrelated values by doing at
> least four pairs of reads, using the tuple where the delta between the
> enclosing TSC reads was smallest. From the fourth iteration onwards bail
> if the new TSC delta isn't better (smaller) than the best earlier one.
>
> Signed-off-by: Jan Beulich <jbeul...@suse.com>
Reviewed-by: Roger Pau Monné <roger....@citrix.com>

> ---
> When running virtualized, scheduling in the host would also constitute
> long latency events. I wonder whether, to compensate for that, we'd want
> more than 3 "base" iterations, as I would expect scheduling events to
> occur more frequently than e.g. SMI (and with a higher probability of
> multiple ones occurring in close succession).

That's hard to tell, maybe we should make the base iteration count
settable from the command line?

> ---
> v3: Fix 24-bit PM timer wrapping between the two read_pt_and_tsc()
>     invocations.
> v2: Use helper functions to fold duplicate code.
>
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -287,9 +287,47 @@ static char *freq_string(u64 freq)
>      return s;
>  }
>
> -static uint64_t adjust_elapsed(uint64_t elapsed, uint32_t actual,
> -                               uint32_t target)
> +static uint32_t __init read_pt_and_tsc(uint64_t *tsc,
> +                                       const struct platform_timesource *pts)
>  {
> +    uint64_t tsc_prev = *tsc = rdtsc_ordered(), tsc_min = ~0;
> +    uint32_t best = best;
> +    unsigned int i;
> +
> +    for ( i = 0; ; ++i )
> +    {
> +        uint32_t pt = pts->read_counter();
> +        uint64_t tsc_cur = rdtsc_ordered();
> +        uint64_t tsc_delta = tsc_cur - tsc_prev;
> +
> +        if ( tsc_delta < tsc_min )
> +        {
> +            tsc_min = tsc_delta;
> +            *tsc = tsc_cur;
> +            best = pt;
> +        }
> +        else if ( i > 2 )
> +            break;
> +
> +        tsc_prev = tsc_cur;
> +    }
> +
> +    return best;
> +}
> +
> +static uint64_t __init calibrate_tsc(const struct platform_timesource *pts)
> +{
> +    uint64_t start, end, elapsed;
> +    unsigned int count = read_pt_and_tsc(&start, pts);
> +    unsigned int target = CALIBRATE_VALUE(pts->frequency), actual;
> +    unsigned int mask = (uint32_t)~0 >> (32 - pts->counter_bits);

Just to be on the safe side you might want to add an assert that
counter_bits <= 32.

Thanks, Roger.