On 29/08/18 13:09, Andrew Cooper wrote:
> On 29/08/18 12:00, Olaf Hering wrote:
>> On Wed, Aug 29, Andrew Cooper wrote:
>>
>>> Architecturally speaking, handing #MC back is probably the closest we
>>> can get to sensible behaviour, but it is still a bug that Linux is
>>> touching the ballooned out page in the first place.
>> Well, the issue is that a read crosses a page boundary. If that would be
>> forbidden, load_unaligned_zeropad() would not exist. It can not know
>> what is in the following page. And such page crossing happens also in
>> the unballooned case. Sadly I can not trigger the reported NFS bug
>> myself. But it can be enforced by ballooning enough pages so that an
>> allocated readdir reply eventually is right in front of a ballooned
>> page.
>
> The Linux bug is not shooting the ballooned page out of the directmap.
> Linux should be taking a fatal #PF for that read, because it's a virtual
> mapping for a frame which Linux has voluntarily elected to make invalid.
>
> As Xen can't prevent Linux from making/maintaining such an invalid
> mapping, throwing #MC back is the next best thing, because terminating
> the access with ~0 is just going to hide the bug, and run at a glacial
> pace while doing so.
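
To make the page-crossing case concrete, here is a minimal sketch in C of
the arithmetic involved. It is not kernel code: the helper names and the
SKETCH_PAGE_SIZE constant are hypothetical, and a 4 KiB page size is
assumed. It only shows when a word-sized read starting near the end of a
page spills into the following page, and by how many bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages (the usual base page size on x86). */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical helper: true if the byte range [addr, addr + len)
 * touches more than one page. */
static bool read_crosses_page(uintptr_t addr, size_t len)
{
    uintptr_t first = addr / SKETCH_PAGE_SIZE;
    uintptr_t last = (addr + len - 1) / SKETCH_PAGE_SIZE;

    return len != 0 && first != last;
}

/* Hypothetical helper: number of bytes of the read that land in the
 * following page (0 if the read stays within its starting page). */
static size_t bytes_in_next_page(uintptr_t addr, size_t len)
{
    size_t left = SKETCH_PAGE_SIZE - (addr % SKETCH_PAGE_SIZE);

    return len > left ? len - left : 0;
}
```

With these assumptions, an 8-byte read whose first byte is 3 bytes before
a page boundary takes 5 bytes from the next page, whatever that page
happens to be backed by.
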

I think you are right: the kernel should in no case access a random page
without knowing it is RAM. Hitting a ballooned page is just much more
probable than hitting an MMIO page this way. There are _no_ guard pages
around MMIO areas, so it could in theory happen that
load_unaligned_zeropad() would access an MMIO area, triggering random
behavior.

So removing ballooned pages from the directmap just hides an underlying
problem.


Juergen