On 18/08/2021 13:44, Andrew Cooper wrote:
> On 18/08/2021 12:30, Marek Marczykowski-Górecki wrote:
>> set_xcr0() and set_msr_xss() use cached value to avoid setting the
>> register to the same value over and over. But suspend/resume implicitly
>> reset the registers and since percpu areas are not deallocated on
>> suspend anymore, the cache gets stale.
>> Reset the cache on resume, to ensure the next write will really hit the
>> hardware. Choose value 0, as it will never be a legitimate write to
>> those registers - and so, will force write (and cache update).
>>
>> Note the cache is used io get_xcr0() and get_msr_xss() too, but:
>> - set_xcr0() is called few lines below in xstate_init(), so it will
>>   update the cache with appropriate value
>> - get_msr_xss() is not used anywhere - and thus not before any
>>   set_msr_xss() that will fill the cache
>>
>> Fixes: aca2a985a55a "xen: don't free percpu areas during suspend"
>> Signed-off-by: Marek Marczykowski-Górecki <[email protected]>
> I'd prefer to do this differently.  As I said in the thread, there are
> other registers such as MSR_TSC_AUX which fall into the same category,
> and I'd like to make something which works systematically.

Ok - after some searching, I think we have problems with:

cpu/common.c:47:DEFINE_PER_CPU(struct cpuidmasks, cpuidmasks);
cpu/common.c:120:static DEFINE_PER_CPU(uint64_t, msr_misc_features);
msr.c:35:DEFINE_PER_CPU(uint32_t, tsc_aux);
xstate.c:36:static DEFINE_PER_CPU(uint64_t, xcr0);
xstate.c:79:static DEFINE_PER_CPU(uint64_t, xss);

There is also:

traps.c:100:DEFINE_PER_CPU(uint64_t, efer);

which we *almost* handle correctly, but fail to update the cache on the
BSP out of S3.


For the APIC, I think we have issues with:

irq.c:1083:static DEFINE_PER_CPU(struct pending_eoi,
pending_eoi[NR_DYNAMIC_VECTORS]);

because we don't defer S3 until all pending EOIs are complete.


I gave up trying to figure out the MCE or power governor logic so we may
have additional issues there.


Additionally,

flushtlb.c:34:DEFINE_PER_CPU(u32, tlbflush_time);

really does need setting appropriately, although I think the only
fallout is a few unnecessary TLB flushes.

~Andrew


Reply via email to