> From: Jan Beulich <jbeul...@suse.com>
> Sent: Tuesday, October 20, 2020 4:10 PM
> 
> On 19.10.2020 18:12, Andrew Cooper wrote:
> > On 19/10/2020 10:09, Jan Beulich wrote:
> >> On 16.10.2020 17:38, Andrew Cooper wrote:
> >>> On 15/10/2020 09:01, Jan Beulich wrote:
> >>>> On 14.10.2020 15:57, Andrew Cooper wrote:
> >>>>> Running with corrupt state is every bit an XSA as hitting a VMEntry
> >>>>> failure if it can be triggered by userspace, but the latter safer and
> >>>>> much more obvious.
> >>>> I disagree. For CPL > 0 we don't "corrupt" guest state any more
> >>>> than reporting a #GP fault when one is going to be reported
> >>>> anyway (as long as the VM entry doesn't fail, and hence the
> >>>> guest won't get crashed). IOW this raising of #GP actually is a
> >>>> precautionary measure to _avoid_ XSAs.
> >>> It does not remove any XSAs.  It merely hides them.
> >> How that? If we convert the ability of guest user mode to crash
> >> the guest into deliver of #GP(0), how is there a hidden XSA then?
> >
> > Because userspace being able to triggering this fixup is still an XSA.
> 
> How do you know without a specific case at hand? It may well be
> that all that's impacted is guest user space, in which case I
> don't see why there would need to be an XSA. It's still a bug
> then, sure.
> 
> >>>>> It was the appropriate security fix (give or take the functional bug in
> >>>>> it) at the time, given the complexity of retrofitting zero length
> >>>>> instruction fetches to the emulator.
> >>>>>
> >>>>> However, it is one of a very long list of guest-state-induced VMEntry
> >>>>> failures, with non-trivial logic which we assert will pass, on a
> >>>>> fastpath, where hardware also performs the same checks and we
> already
> >>>>> have a runtime safe way of dealing with errors.  (Hence not actually
> >>>>> using ASSERT_UNREACHABLE() here.)
> >>>> "Runtime safe" as far as Xen is concerned, I take it. This isn't safe
> >>>> for the guest at all, as vmx_failed_vmentry() results in an
> >>>> unconditional domain_crash().
> >>> Any VMEntry failure is a bug in Xen.  If userspace can trigger it, it is
> >>> an XSA, *irrespective* of whether we crash the domain then and there,
> or
> >>> whether we let it try and limp on with corrupted state.
> >> Allowing the guest to continue with corrupted state is not a
> >> useful thing to do, I agree. However, what falls under
> >> "corrupted" seems to be different for you and me. I'd not call
> >> delivery of #GP "corruption" in any way.
> >
> > I can only repeat my previous statement:
> >
> >> There are legal states where RIP is 0x0000800000000000 and #GP is the
> >> wrong thing to do.
> >
> > Blindly raising #GP in is not always the right thing to do.
> 
> Again - we're in agreement about "blindly". Let's be less blind.
> 
> >>  The primary goal ought
> >> to be that we don't corrupt the guest kernel view of the world.
> >> It may then have the opportunity to kill the offending user
> >> mode process.
> >
> > By the time we have hit this case, all bets are off, because Xen *is*
> > malfunctioning.  We have no idea if kernel context is still intact.  You
> > don't even know that current user context is the correct offending
> > context to clobber, and might be creating a user=>user DoS vulnerability.
> >
> > We definitely have an XSA to find and fix, and we can either make it
> > very obvious and likely to be reported, or hidden and liable to go
> > unnoticed for a long period of time.
> 
> Why would it go unnoticed when we log the incident? I very much
> hope people inspect their logs at least every once in a while ...
> 
> And as per above - I disagree with your use of "definitely" here.
> We have a bug to analyze and fix, yes. Whether it's an XSA-worthy
> one isn't known ahead of time, unless we crash the guest in such
> a case.
> 
> In any event I think it's about time that the VMX maintainers
> voice their views here, as they're the ones to approve of
> whichever change we end up with. Kevin, Jun?
> 

Honestly speaking both of your options make some sense. and
I don't think there is a perfect answer here. Personally I'm more
aligned with Jan's point on preventing guest user from crashing
the whole domain. But let's also hear from others' opinions as 
I believe this dilemma might be seen in other places too thus 
may need a general consensus in Xen community.

Thanks
Kevin

Reply via email to