On 07/12/2022 09:41, Jan Beulich wrote: > On 06.12.2022 18:55, Demi Marie Obenour wrote: >> On Tue, Dec 06, 2022 at 12:06:24PM +0000, Andrew Cooper wrote: >>> On 06/12/2022 04:33, Demi Marie Obenour wrote: >>>> @@ -961,13 +1000,24 @@ get_page_from_l1e( >>>> >>>> switch ( l1f & PAGE_CACHE_ATTRS ) >>>> { >>>> - case _PAGE_WB: >>>> + default: >>>> +#ifndef NDEBUG >>>> + printk(XENLOG_G_WARNING >>>> + "d%d: Guest tried to use bad cachability attribute %u >>>> for MFN %lx\n", >>>> + l1e_owner->domain_id, l1f & PAGE_CACHE_ATTRS, mfn); >>> %pd. You absolutely want to convert the PTE bits to a PAT value before >>> priniting (decimal on a PTE value is useless), and %PRI_mfn. >> I’ll have to look at the rest of the Xen tree to see how to use this. >> >>>> + pv_inject_hw_exception(TRAP_gp_fault, 0); >>> As I said on IRC, we do want this, but I'm not certain if we can get >>> away with just enabling it in debug builds. _PAGE_GNTTAB was ok because >>> it has always been like that, but there's a non-trivial chance that >>> there are existing dom0 kernels which violate this constraint. >> I chose this approach because it is very simple to implement. Anything >> more complex ought to be in a separate patch, IMO. >> >>>> + return -EINVAL; >>>> +#endif > As to "complex": Just the "return -EINVAL" would be yet less complex. > Irrespective of the remark my understanding of Andrew's response is that > the concern extends to the error return as well.
It's a tradeoff. From a debugging point of view, #GP is absolutely the way to go, because you get a full backtrace out in Linux. It doesn't matter in the slightest which side of the SYSCALL instruction it points at, because that's not the interesting information to look at. Returning -EINVAL stops the problematic cache attributes from being used, but is far more variable in terms of noticeable change inside Linux. It ranges from hitting a BUG(), to having the return code lost. In this case, I'd err on the side of a #GP fault because it is a definite error in the guest needing debugging and fixing. But as I said originally, it probably needs to be on-by-default with a knob to disable. ~Andrew