On 18/07/2019 13:10, Jan Beulich wrote:
> From: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
>
> Programming MTRR registers in multi-processor systems is a rather lengthy
> process. Furthermore, all processors must program these registers in lock
> step and with interrupts disabled; the process also involves flushing
> caches and TLBs twice. As a result, the process may take a considerable
> amount of time.
>
> On some platforms, this can lead to a large skew of the refined-jiffies
> clock source. Early when booting, if no other clock is available (e.g.,
> booting with hpet=disabled), the refined-jiffies clock source is used to
> monitor the TSC clock source. If the skew of refined-jiffies is too large,
> Linux wrongly assumes that the TSC is unstable:
>
>    clocksource: timekeeping watchdog on CPU1: Marking clocksource
>                 'tsc-early' as unstable because the skew is too large:
>    clocksource: 'refined-jiffies' wd_now: fffedc10 wd_last:
>                 fffedb90 mask: ffffffff
>    clocksource: 'tsc-early' cs_now: 5eccfddebc cs_last: 5e7e3303d4
>                 mask: ffffffffffffffff
>    tsc: Marking TSC unstable due to clocksource watchdog
>
> As per measurements, around 98% of the time needed by the procedure to
> program MTRRs in multi-processor systems is spent flushing caches with
> wbinvd(). As per the Section 11.11.8 of the Intel 64 and IA 32
> Architectures Software Developer's Manual, it is not necessary to flush
> caches if the CPU supports cache self-snooping. Thus, skipping the cache
> flushes can reduce by several tens of milliseconds the time needed to
> complete the programming of the MTRR registers:
>
> Platform                              Before     After
> 104-core (208 Threads) Skylake  1437ms      28ms
>    2-core (  4 Threads) Haswell   114ms       2ms
>
> Reported-by: Mohammad Etemadi <mohammad.etem...@intel.com>
> Signed-off-by: Ricardo Neri <ricardo.neri-calde...@linux.intel.com>
> [Linux commit fd329f276ecaad7a371d6f91b9bbea031d0c3440]
>
> Use alternatives patching instead of static_cpu_has() (which we don't
> have [yet]).
> Interestingly we've been lacking the 2nd wbinvd(), which I'm taking the
> liberty here.
>
> Requested-by: Andrew Cooper <andrew.coop...@citrix.com>
> Signed-off-by: Jan Beulich <jbeul...@suse.com>

Reviewed-by: Andrew Cooper <andrew.coop...@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Reply via email to