Ping?

On Thu, Jun 09, 2022 at 12:09:18PM +0200, Roger Pau Monné wrote:
> On Thu, Jun 09, 2022 at 03:39:33PM +0800, Xiaoyao Li wrote:
> > On 6/9/2022 3:04 PM, Tian, Kevin wrote:
> > > +Chenyi/Xiaoyao who worked on the KVM support. Presumably
> > > similar opens have been discussed in KVM hence they have the
> > > right background to comment here.
> > > 
> > > > From: Roger Pau Monne <[email protected]>
> > > > Sent: Thursday, May 26, 2022 7:12 PM
> > > > 
> > > > Under certain conditions guests can get the CPU stuck in an unbounded
> > > > loop without the possibility of an interrupt window occurring on an
> > > > instruction boundary.  This was the case with the scenarios described
> > > > in XSA-156.
> > > > 
> > > > Make use of the Notify VM Exit mechanism, which will trigger a VM Exit
> > > > if no interrupt window occurs for a specified amount of time.  Note
> > > > that using the Notify VM Exit avoids having to trap #AC and #DB
> > > > exceptions, as Xen is guaranteed to get a VM Exit even if the guest
> > > > puts the CPU in a loop without an interrupt window.  As such, disable
> > > > the intercepts if the feature is available and enabled.
> > > > 
> > > > Setting the notify VM exit window to 0 is safe because there's a
> > > > threshold added by the hardware in order to have a sane window value.
> > > > 
> > > > Suggested-by: Andrew Cooper <[email protected]>
> > > > Signed-off-by: Roger Pau Monné <[email protected]>
> > > > ---
> > > > Changes since v1:
> > > >   - Properly update debug state when using notify VM exit.
> > > >   - Reword commit message.
> > > > ---
> > > > This change enables the notify VM exit by default, KVM however doesn't
> > > > seem to enable it by default, and there's the following note in the
> > > > commit message:
> > > > 
> > > > "- There's a possibility, however small, that a notify VM exit happens
> > > >     with VM_CONTEXT_INVALID set in exit qualification. In this case, the
> > > >     vcpu can no longer run. To avoid killing a well-behaved guest, set
> > > >     notify window as -1 to disable this feature by default."
> > > > 
> > > > It's not clear to me whether the comment was meant to be:
> > > > "There's a possibility, however small, that a notify VM exit _wrongly_
> > > > happens with VM_CONTEXT_INVALID".
> > > > 
> > > > It's also not clear whether such wrong hardware behavior only affects
> > > > a specific set of hardware,
> > 
> > I'm not sure what you mean by a specific set of hardware.
> > 
> > We make it default off in KVM just in case future silicon wrongly sets the
> > VM_CONTEXT_INVALID bit, because our policy is that the VM cannot continue
> > running in that case.
> > 
> > In the worst case, if some future silicon happens to have this kind of silly
> > bug, then all existing product kernels risk having their VMs killed because
> > the feature is on by default.
> 
> That's IMO a weird policy.  If there's such behavior in any hardware
> platform I would assume Intel would issue an erratum, and then we would
> just avoid using the feature on affected hardware (like we do with
> other hardware features when they have errata).
> 
> If we applied the same logic to all new Intel features we wouldn't use
> any of them.  At least in Xen there are already combinations of vmexit
> conditions that will lead to the guest being killed.
> 
> > > > in a way that we could avoid enabling
> > > > notify VM exit there.
> > > > 
> > > > There's a discussion in one of the Linux patches that 128K might be
> > > > the safer value in order to prevent false positives, but I have no
> > > > formal confirmation about this.  Maybe our Intel maintainers can
> > > > provide some more feedback on a suitable notify VM exit window
> > > > value.
> > 
> > The 128k is the internal threshold for SPR silicon. The internal threshold
> > is tuned by Intel for each silicon, to make sure it's big enough to avoid
> > false positives even when the user sets vmcs.notify_window to 0.
> > 
> > However, it varies for different processor generations.
> > 
> > It's hard to say what a suitable value is; it depends on how soon the VMM
> > wants to intercept the VM. Anyway, Intel ensures that even value 0 is safe.
> 
> Ideally we need a fixed default value that's guaranteed to work on all
> possible hardware that supports the feature, or alternatively a way to
> calculate a sane default window based on the hardware platform.
> 
> Could we get some wording added to the ISE regarding 0 being a
> suitable default value to use because hardware will add a threshold
> internally to make the value safe?
> 
> Thanks, Roger.
> 

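For reference, a rough sketch of the model discussed above, written in
Python purely for illustration. It assumes the hardware adds its internal
threshold to the programmed notify window (so a window of 0 still leaves
the full threshold), and that the feature would simply be skipped on
hardware with a known erratum. The names effective_window and
should_enable, the additive relation, and the units are assumptions, not
taken from the Xen or KVM patches or from the ISE.

```python
# Illustrative model only: function names and the additive relation are
# assumptions made for this sketch, not code from Xen, KVM or the ISE.

# "128k" is the internal threshold quoted for SPR silicon in this thread.
# Other processor generations are said to use different values.
SPR_INTERNAL_THRESHOLD = 128 * 1024


def effective_window(notify_window: int, internal_threshold: int) -> int:
    """Return the assumed total time without an interrupt window before a
    notify VM exit triggers: internal threshold plus programmed window."""
    if notify_window < 0 or internal_threshold < 0:
        raise ValueError("window and threshold must be non-negative")
    return internal_threshold + notify_window


def should_enable(feature_supported: bool, has_erratum: bool) -> bool:
    """Enable by default, except on hardware with a known erratum."""
    return feature_supported and not has_erratum
```

Under these assumptions, effective_window(0, SPR_INTERNAL_THRESHOLD) is
the threshold itself, which is why programming 0 would still be safe as
long as the per-silicon threshold is large enough.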