On 19/07/18 11:49, Jan Beulich wrote: > Checking the low 5 bits of CR3 is not the job of vmx_load_pdptrs(). > Instead it should #GP upon bad PDPTE values, rather than causing a VM > entry failure. > > Signed-off-by: Jan Beulich <jbeul...@suse.com> > > --- a/xen/arch/x86/hvm/vmx/vmx.c > +++ b/xen/arch/x86/hvm/vmx/vmx.c > @@ -1361,20 +1361,18 @@ static void vmx_set_interrupt_shadow(str > __vmwrite(GUEST_INTERRUPTIBILITY_INFO, intr_shadow); > } > > -static void vmx_load_pdptrs(struct vcpu *v) > +static bool vmx_load_pdptrs(struct vcpu *v) > { > unsigned long cr3 = v->arch.hvm_vcpu.guest_cr[3]; > - uint64_t *guest_pdptes; > + uint64_t *guest_pdptes, valid_mask; > struct page_info *page; > p2m_type_t p2mt; > char *p; > + unsigned int i; > > /* EPT needs to load PDPTRS into VMCS for PAE. */ > if ( !hvm_pae_enabled(v) || (v->arch.hvm_vcpu.guest_efer & EFER_LMA) ) > - return; > - > - if ( (cr3 & 0x1fUL) && !hvm_pcid_enabled(v) ) > - goto crash; > + return true; > > page = get_page_from_gfn(v->domain, cr3 >> PAGE_SHIFT, &p2mt, > P2M_UNSHARE);
cr3 needs truncating to 32 bits before doing this. The upper bits of cr3 can remain set after transitioning away from long mode, which will cause this emulation to do the wrong thing. > if ( !page ) > @@ -1385,34 +1383,47 @@ static void vmx_load_pdptrs(struct vcpu > gdprintk(XENLOG_ERR, > "Bad cr3 on load pdptrs gfn %lx type %d\n", > cr3 >> PAGE_SHIFT, (int) p2mt); > - goto crash; > + domain_crash(v->domain); > + return false; > } > > p = __map_domain_page(page); > > - guest_pdptes = (uint64_t *)(p + (cr3 & ~PAGE_MASK)); > + guest_pdptes = (uint64_t *)(p + (cr3 & 0xfe0)); You can drop p, and guest_pdptes can lose its prefix. > > - /* > - * We do not check the PDPTRs for validity. The CPU will do this during > - * vm entry, and we can handle the failure there and crash the guest. > - * The only thing we could do better here is #GP instead. > - */ > + valid_mask = ((1ULL << v->domain->arch.cpuid->extd.maxphysaddr) - 1) & > + (PAGE_MASK | _PAGE_AVAIL | _PAGE_PRESENT); How did you come across this list? The only valid bits are Present, PWT and PCD, while the upper nibble of control bits is documented as ignored rather than reserved. > + for ( i = 0; i < 4; ++i ) > + if ( (guest_pdptes[i] & _PAGE_PRESENT) && > + (guest_pdptes[i] & ~valid_mask) ) > + { > + if ( v == current ) > + hvm_inject_hw_exception(TRAP_gp_fault, 0); The function is void for the same reason why this isn't correct. We are in the hvm_update_* path rather than the set_* path, which is beyond the point of being able to unwind (and why I haven't yet got around to fixing this function). The only way I can see of fixing this is plumbing everything into a new paging_set_cr3() callback, which can return X86EMUL_EXCEPTION and fail the hvm_set_cr3() call. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel