On Mon, Feb 14, 2022 at 10:24:49AM +0100, Jan Beulich wrote:
> Calibration logic assumes that the platform timer (HPET or ACPI PM
> timer) and the TSC are read at about the same time. This assumption may
> not hold when a long latency event (e.g. SMI or NMI) occurs between the
> two reads. Reduce the risk of reading uncorrelated values by doing at
> least four pairs of reads, using the tuple where the delta between the
> enclosing TSC reads was smallest. From the fourth iteration onwards bail
> if the new TSC delta isn't better (smaller) than the best earlier one.
> 
> Signed-off-by: Jan Beulich <jbeul...@suse.com>

Reviewed-by: Roger Pau Monné <roger....@citrix.com>
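
(To illustrate the selection logic described above: with, say, four read
pairs whose enclosing TSC deltas are 180, 95, 140 and 120 cycles, the
platform timer value from the second pair is used, as its two TSC reads
are the most likely to tightly bracket the counter read.)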

> ---
> When running virtualized, scheduling in the host would also constitute
> long latency events. I wonder whether, to compensate for that, we'd want
> more than 3 "base" iterations, as I would expect scheduling events to
> occur more frequently than e.g. SMI (and with a higher probability of
> multiple ones occurring in close succession).

That's hard to tell; maybe we should make the base iteration count
settable from the command line?
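
Something along these lines, as a rough sketch (the option name, variable
and default below are purely illustrative, not taken from the patch):

    /* Hypothetical knob; name and default are illustrative only. */
    static unsigned int __initdata opt_tsc_cal_reads = 4;
    integer_param("tsc-calibration-reads", opt_tsc_cal_reads);

read_pt_and_tsc() could then check i against this variable in place of
the hard-coded 2 (modulo the exact off-by-one convention chosen).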

> ---
> v3: Fix 24-bit PM timer wrapping between the two read_pt_and_tsc()
>     invocations.
> v2: Use helper functions to fold duplicate code.
> 
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -287,9 +287,47 @@ static char *freq_string(u64 freq)
>      return s;
>  }
>  
> -static uint64_t adjust_elapsed(uint64_t elapsed, uint32_t actual,
> -                               uint32_t target)
> +static uint32_t __init read_pt_and_tsc(uint64_t *tsc,
> +                                       const struct platform_timesource *pts)
>  {
> +    uint64_t tsc_prev = *tsc = rdtsc_ordered(), tsc_min = ~0;
> +    uint32_t best = best;
> +    unsigned int i;
> +
> +    for ( i = 0; ; ++i )
> +    {
> +        uint32_t pt = pts->read_counter();
> +        uint64_t tsc_cur = rdtsc_ordered();
> +        uint64_t tsc_delta = tsc_cur - tsc_prev;
> +
> +        if ( tsc_delta < tsc_min )
> +        {
> +            tsc_min = tsc_delta;
> +            *tsc = tsc_cur;
> +            best = pt;
> +        }
> +        else if ( i > 2 )
> +            break;
> +
> +        tsc_prev = tsc_cur;
> +    }
> +
> +    return best;
> +}
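
(Note: the "uint32_t best = best;" self-initialization above is the usual
idiom for suppressing a bogus "may be used uninitialized" warning; in
practice best is always written on the first loop iteration, since tsc_min
starts out as ~0.)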
> +
> +static uint64_t __init calibrate_tsc(const struct platform_timesource *pts)
> +{
> +    uint64_t start, end, elapsed;
> +    unsigned int count = read_pt_and_tsc(&start, pts);
> +    unsigned int target = CALIBRATE_VALUE(pts->frequency), actual;
> +    unsigned int mask = (uint32_t)~0 >> (32 - pts->counter_bits);

Just to be on the safe side, you might want to add an assert that
counter_bits <= 32.
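
I.e. something like this at the top of calibrate_tsc() (a minimal sketch;
ASSERT() being the usual Xen debug-build assertion macro):

    ASSERT(pts->counter_bits && pts->counter_bits <= 32);

The non-zero half arguably matters as well, since counter_bits == 0 would
make the "32 - pts->counter_bits" shift undefined.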

Thanks, Roger.
