Hi guys

I have found the problem (after hours and hours of gruesome
debugging with the almighty print) and it seems that this could potentially
have quite a bit of impact if altp2m is enabled for a guest domain (even if
the
functionality is never actively used), since destroying any vcpu of this
guest could lead to a hypervisor panic.
So a malicious user could simply destroy and restart his VM(s) in order to
DOS the VMs of other users by killing the hypervisor.
Granted, this is not very effective, but, depending on the environment, it
is extremely easy to implement.
The bug persists in Xen 4.7 and I do not that it was fixed in the current
master branch.

The following happens.
The call
void hvm_vcpu_destroy(struct vcpu *v)
{
    hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
    if ( hvm_altp2m_supported() )
        altp2m_vcpu_destroy(v);

at some time reaches vmx_vcpu_update_eptp which ends with a
vmx_vmcs_exit(v);.
There vmx_clear_vmcs(v); -> __vmx_clear_vmcs  is called where the
current_vmcs is invalidated if the current vmcs in the CPU is the same as
virt_to_maddr (v->arch.hvm_vmx->vmcs):

__vmpclear(virt_to_maddr(arch_vmx->vmcs)); (
http://www.tptp.cc/mirrors/siyobik.info/instruction/VMCLEAR.html )

To check this assumption I implemented a basic __vmptrst (
http://www.tptp.cc/mirrors/siyobik.info/instruction/VMPTRST.html ) and added
the result to the debug output.
(XEN) vmcs.c:519:IDLEv4 __vmx_clear_vmcs: realVMCS BEFORE __vmpclear
82415a000 
(XEN) vmcs.c:522:IDLEv4 __vmx_clear_vmcs: realVMCS AFTER __vmpclear
ffffffffffffffff

After that no vmcs_load / enter is executed so the vmcs in the CPU remains
invalidated.

For the next function in hvm_vcpu_destroy, the nestedhvm_vcpu_destroy(v) the
missing vmcs is no problem (at least in our use case), but the
free_compat_arg_xlat crashes.
The callstack is as follows:
hvm_vcpu_destroy
free_compat_arg_xlat
destroy_perdomain_mapping
map_domain_page
(probably inlined) mapcache_current_vcpu
sync_local_execstate
__sync_local_execstate
__context_switch
(with function pointer v->arch.ctxt_switch_from = vmx_ctxt_switch_from)
vmx_ctxt_switch_from 
(probably inlined) vmx_fpu_leave

There a vmwrite is tried if either ( !(v->arch.hvm_vmx.host_cr0 &
X86_CR0_TS) ) or ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_TS) ) is true.
The executed vmwrite then crashes.
As my knowledge of Xen is not that comprehensive, could you tell me when the
TS-bits are set / cleared and what they are used for?

static void vmx_fpu_leave(struct vcpu *v)
{
    ASSERT(!v->fpu_dirtied);
    ASSERT(read_cr0() & X86_CR0_TS);

    if ( !(v->arch.hvm_vmx.host_cr0 & X86_CR0_TS) )
    {
        v->arch.hvm_vmx.host_cr0 |= X86_CR0_TS;
        __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);
    }

    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_TS) )
    {
        v->arch.hvm_vcpu.hw_cr[0] |= X86_CR0_TS;
        __vmwrite(GUEST_CR0, v->arch.hvm_vcpu.hw_cr[0]);
        v->arch.hvm_vmx.exception_bitmap |= (1u << TRAP_no_device);
        vmx_update_exception_bitmap(v);
    }
}

In the crash dump the additional debug output shows that at least one
__vmwrite will be tried and that the VMCS in the CPU is invalidated:
(XEN) vmx.c:698:IDLEv4 vmx_fpu_leave: vcpu ffff8300defae000 vmcs
ffff8301586c9000 host_cr0-case FALSE guest_cr[0]-case TRUE curr
ffff8300df2fb000 curr->arch.hvm_vmx.vmcs 0000000000000000 realVMCS
ffffffffffffffff

As a quick fix I patched the fpu_leave to only allow the __vmwrite when the
realVMCS is valid.
This seems to work fine, but requires a call to __vmptrst every time
vmx_fpu_leave is called. Also I do not know if an ignored TS has any
negative consequences when destroying a vcpu. I assume that this is not
case. In our tests nothing pointed to any problems.

I added the patch to enable altp2m unconditionally and a patch which evades
the panic in vmx_fpu_leave.
They are not pretty or anywhere near production ready, but I think you will
get the idea.
I tried to implement the __vmptrst with the #ifdef HAVE_GAS_VM parts (
analogue to the other functions in vmx.h ) but failed miserably since I lack
the required knowledge about the OPCODE definitions. :-D

Cheers

Kevin

> -----Urspr√ľngliche Nachricht-----
> Von: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Gesendet: Montag, 22. August 2016 13:58
> An: Mayer, Kevin <kevin.ma...@gdata.de>; jbeul...@suse.com
> Cc: xen-devel@lists.xen.org
> Betreff: Re: AW: [Xen-devel] Xen 4.6.1 crash with altp2m enabledbydefault
> 
> On 19/08/16 11:01, kevin.ma...@gdata.de wrote:
> > Hi
> >
> > I took another look at Xen and a new crashdump.
> > The last successful __vmwrite should be in static void
> > vmx_vcpu_update_vmfunc_ve(struct vcpu *v) [...]
> >     __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> >               v->arch.hvm_vmx.secondary_exec_control);
> > [...]
> > After this the altp2m_vcpu_destroy wakes up the vcpu and is then
> finished.
> >
> > In nestedhvm_vcpu_destroy (nvmx_vcpu_destroy) the vmcs can
> overwritten (but is not reached in our case as far as I can see):
> >     if ( nvcpu->nv_n1vmcx )
> >         v->arch.hvm_vmx.vmcs = nvcpu->nv_n1vmcx;
> >
> > In conclusion:
> > When destroying a domain the altp2m_vcpu_destroy(v); path seems to
> mess up the vmcs which ( only ) sometimes leads to a failed __vmwrite in
> vmx_fpu_leave.
> > That is as far as I can get with my understanding of the Xen code.
> >
> > Do you guys have any additional ideas what I could test / analyse?
> 
> Do you have easy reproduction instructions you could share?  Sadly, this
is
> looking like an issue which isn't viable to debug over email.
> 
> ~Andrew

____________
Virus checked by G Data MailSecurity
Version: AVA 25.8368 dated 21.09.2016
Virus news: www.antiviruslab.com

Attachment: xen-altp2menable.patch
Description: Binary data

Attachment: xen-vmx_altp2m_bug.patch
Description: Binary data

Attachment: smime.p7s
Description: S/MIME cryptographic signature

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Reply via email to