On 14/10/2024 7:26 pm, Marek Marczykowski-Górecki wrote:
> Hi,
>
> It looks like we've identified the second buggy driver that somewhere
> assumes PAT is configured as Linux normally do natively - nvidia binary
> one this time[3]. The first one affected was i915, but it turned out to be
> a bug in Linux mm. It was eventually fixed[1], but it was quite painful
> debugging. This time a proper fix is not known yet. Since the previous
> issue, Qubes OS carried a patch[2] that changes Xen to use same PAT as
> Linux. We recently dropped this patch, since the Linux fix reached all
> supported by us branches, but apparently it wasn't all...
>
> Anyway, would it be useful (and acceptable) for upstream Xen to have
> a kconfig option (behind UNSUPPORTED or so) to switch this behavior?

Not UNSUPPORTED - it's bogus and I still want it purged.

But, behind EXPERT, with a suitable description (e.g. "This breaks
various ABIs including migration, and is presented here for debugging PV
driver issues in a single system.  If turning it on fixes a bug, please
contact upstream Xen"), then I think we need to take it.

The fact that I've had to recommend it once already this week for
debugging purposes, and it wasn't even this Nvidia bug, demonstrates how
pervasive the problems are.

> Technically, it's a PV ABI violation, and it does break few things
> (definitely PV domU with passthrough are affected - Xen considers them
> L1TF vulnerable then; PV live migration is most likely broken too).

Do you have more information on this?  The PAT bits shouldn't form any
part of L1TF considerations.

~Andrew