Thanks for the clarification, Andrew.
On Tue, Feb 20, 2018 at 5:20 PM, Andrew Cooper <andrew.coop...@citrix.com> wrote:
> On 21/02/2018 00:42, Andres Lagar Cavilla wrote:
> > Hello everyone,
> > I was thinking of the traditional Xen PV mode in which page table
> > pages are write protected from guest meddling and PTE
> > modifications are audited by the hypervisor (ptwr_emulated_update()
> > these days, still?).
> Something like that, yes. Alternatively, via explicit hypercall which
> is faster than the trap&emulate path.
> > Without software shadows or paging to e.g. an EPT, native PV loads the
> > actual CR3 pointing to a write protected page table tree.
> Unfortunately, I've lost you here. There is no such thing as a
> write-protected pagetable tree in the traditional PV sense.
> > When the cr3 is loaded, the hardware walker will want to set A and D
> > bits in PTEs -- is this action immune to the write protection in the
> > page table pages themselves? Or do we take emulation faults on these
> > updates as well?
> The protection that Xen enforces on PV guests is that an L1 PTE mapping
> a pagetable frame must never be writeable. This protection happens at
> the linear address level. When the CPU pagewalker tries to set A/D
> bits, it issues an atomic update to the physical address of the
> pagetable entry which needs updating.
> As with everything, there are complicating factors. With EPT/NPT for
> HVM guests these days, the hypervisor can also apply permissions to
> guest physical addresses, as part of their translation to host physical
> addresses. The hardware pagewalker, when attempting to set an A/D bit
> of the HVM guest's regular pagetables, issues an EPT/NPT write (well -
> RMW strictly) to set the bits.
> Therefore, if the hypervisor marks an HVM guest's pagetable as
> read-only, then the hardware pagewalker trying to set A/D bits will
> vmexit with an EPT/NPT permissions violation. This is one major
> performance limiting factor of introspection technology at the moment.
Indeed, this is what I was getting at. In my experience, guests are very
adversely affected if we just latch the D bits to 1 unilaterally (which the
"hardware" is legally allowed to do), as they are led to believe that file
cache pages are in constant need of writeback (and A bits latched to 1
turn e.g. Linux's vmscan.c into a crapshoot).
So this is currently not too hopeful.
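
For what it's worth, here is a toy C model of the distinction Andrew
describes, mostly to check my own understanding. It is only a sketch and
not Xen code: the struct, the function names and the single-PTE
simplification are all hypothetical. The only facts it relies on are that
the x86 Accessed and Dirty flags are bits 5 and 6 of a PTE, and that the
walker updates the entry by physical address.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_ACCESSED (1ull << 5)   /* x86 A bit */
#define PTE_DIRTY    (1ull << 6)   /* x86 D bit */

/* Hypothetical outcome of a hardware A/D update in this toy model. */
enum walk_result { WALK_OK, WALK_EPT_VIOLATION };

/* Hypothetical stand-in for one pagetable frame holding a single PTE. */
struct pt_frame {
    uint64_t pte;
    bool linear_map_writable; /* PV: Xen never allows a writable L1 mapping */
    bool ept_writable;        /* HVM: EPT/NPT write permission on the frame */
};

/*
 * PV case: the walker writes to the physical address of the entry, so the
 * read-only linear mapping of the pagetable frame is never consulted.
 */
static enum walk_result pv_walker_set_ad(struct pt_frame *f, bool is_write)
{
    f->pte |= PTE_ACCESSED | (is_write ? PTE_DIRTY : 0);
    return WALK_OK;
}

/*
 * HVM case: the walker's update is itself a guest-physical write, so it is
 * subject to EPT/NPT permissions and faults if the frame is read-only.
 */
static enum walk_result hvm_walker_set_ad(struct pt_frame *f, bool is_write)
{
    uint64_t want = PTE_ACCESSED | (is_write ? PTE_DIRTY : 0);

    if ((f->pte & want) == want)
        return WALK_OK;             /* bits already set, no write needed */
    if (!f->ept_writable)
        return WALK_EPT_VIOLATION;  /* modelled as a vmexit */
    f->pte |= want;
    return WALK_OK;
}
```

In this model, pre-setting A and D in the PTE is the only way for the HVM
walker to avoid the violation on a read-only frame, which is exactly the
latching I was complaining about above.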
Xen-devel mailing list