On 06/01/2026 7:59 am, Jan Beulich wrote:
>>>> @@ -489,17 +484,17 @@ void xrstor(struct vcpu *v, uint64_t mask)
>>>>              ptr->xsave_hdr.xcomp_bv = 0;
>>>>          }
>>>>          memset(ptr->xsave_hdr.reserved, 0, sizeof(ptr->xsave_hdr.reserved));
>>>> -        continue;
>>>> +        goto retry;
>>>>  
>>>>      case 2: /* Stage 2: Reset all state. */
>>>>          ptr->fpu_sse.mxcsr = MXCSR_DEFAULT;
>>>>          ptr->xsave_hdr.xstate_bv = 0;
>>>>          ptr->xsave_hdr.xcomp_bv = v->arch.xcr0_accum & XSTATE_XSAVES_ONLY
>>>>              ? XSTATE_COMPACTION_ENABLED : 0;
>>>> -        continue;
>>>> -    }
>>>> +        goto retry;
>>>>  
>>>> -        domain_crash(current->domain);
>>>> +    default: /* Stage 3: Nothing else to do. */
>>>> +        domain_crash(v->domain, "Uncorrectable XRSTOR fault\n");
>>>>          return;
>>> There's an unexplained change here as to which domain is being crashed.
>>> You switch to crashing the subject domain, yet if that's not also the
>>> requester, it isn't "guilty" in causing the observed fault.
>> So dom0 should be crashed because there's bad data in the migration stream?
> Well, I'm not saying the behavior needs to stay like this, or that it's
> the best of all possible options. But in principle Dom0 could sanitize the
> migration stream before passing it to Xen. So it is still first and foremost
> Dom0 which is to blame.

BNDCFGU contains a pointer which, for PV context, needs access_ok(), not
just a regular canonical check.  Most supervisor states are in a similar
position.
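
To illustrate the distinction, here is a minimal sketch.  It is not Xen's
actual access_ok(); the helper names are hypothetical, and the 48-bit
address width and the reserved hypervisor range are assumptions made for
the example.  It shows a pointer which passes a canonical check yet lands
in the range reserved for Xen in a 64-bit PV guest's address space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define VADDR_BITS            48
#define HYPERVISOR_VIRT_START UINT64_C(0xffff800000000000)
#define HYPERVISOR_VIRT_END   UINT64_C(0xffff880000000000)

/* Canonical: bits 63:47 are either all clear or all set. */
static bool is_canonical(uint64_t addr)
{
    uint64_t top = addr >> (VADDR_BITS - 1);

    return top == 0 || top == UINT64_C(0x1ffff);
}

/*
 * Hypothetical PV-style range check: the whole of [addr, addr + size)
 * must be guest-accessible, i.e. entirely in the lower canonical half or
 * entirely above the range reserved for the hypervisor.
 */
static bool pv_range_ok(uint64_t addr, uint64_t size)
{
    uint64_t last;

    if ( size == 0 )
        return true;

    last = addr + size - 1;
    if ( last < addr )          /* Range wraps around the address space. */
        return false;

    if ( last < (UINT64_C(1) << (VADDR_BITS - 1)) )
        return true;            /* Entirely in the lower half. */

    return addr >= HYPERVISOR_VIRT_END;
}
```

With these assumptions, 0xffff800000001000 is canonical but fails
pv_range_ok(), which is the kind of value a canonical-only check would
let through.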

Just because Xen has managed to get away without such checks (by not yet
supporting a state where it matters), I don't agree that it's safe to
trust dom0 to do this.


For this case, it's v's xstate buffer which cannot be loaded, so it's v
which cannot be context switched into, and must be crashed.  More below.
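
As a rough model of the staged recovery in the hunk above (a sketch under
stated assumptions, not the real xrstor(): the structures, try_xrstor()
and the "unloadable" flag are all hypothetical stand-ins for the hardware
instruction and its exception fixup), each fault escalates to a more
drastic reset of v's own buffer, and only when nothing helps is v marked
as crashed:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MXCSR_DEFAULT 0x1f80u

/* Simplified, hypothetical model of a vCPU's xsave buffer. */
struct xsave_hdr_model {
    uint64_t xstate_bv, xcomp_bv, reserved[6];
};

struct vcpu_model {
    struct xsave_hdr_model hdr;
    uint32_t mxcsr;
    bool unloadable;   /* Models a fault which no reset can fix. */
    bool crashed;
};

/* Stand-in for XRSTOR plus exception fixup: true if the load succeeds. */
static bool try_xrstor(const struct vcpu_model *v)
{
    if ( v->unloadable )
        return false;

    for ( unsigned int i = 0; i < 6; i++ )
        if ( v->hdr.reserved[i] )
            return false;           /* Reserved header bytes must be 0. */

    return !(v->mxcsr & ~0xffffu);  /* Assumed MXCSR reserved bits. */
}

/* Returns the number of faults taken before success or giving up. */
static unsigned int restore_with_recovery(struct vcpu_model *v)
{
    unsigned int faults = 0;

 retry:
    if ( try_xrstor(v) )
        return faults;

    switch ( ++faults )
    {
    case 1: /* Stage 1: Sanitise the header. */
        memset(v->hdr.reserved, 0, sizeof(v->hdr.reserved));
        goto retry;

    case 2: /* Stage 2: Reset all state. */
        v->mxcsr = MXCSR_DEFAULT;
        v->hdr.xstate_bv = 0;
        v->hdr.xcomp_bv = 0;
        goto retry;

    default: /* Stage 3: Nothing else to do; v cannot be loaded. */
        v->crashed = true;
        return faults;
    }
}
```

The point of the model is that every recovery step, and the final crash,
acts on the buffer owner rather than on whoever happens to be running.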


>> v is always curr.
> Not quite - see xstate_set_init().

Also more below.

> And for some of the callers of
> hvm_update_guest_cr() I also don't think they always act on current. In
> particular hvm_vcpu_reset_state() never does, I suppose (not the least
> because of the vcpu_pause() in its sole caller).

We discussed the need to not be remotely poking register state like
that.  But I don't see where the connection is between
hvm_update_guest_cr() and xsave()/xrstor().

Tangent: hvm_vcpu_reset_state() is terribly named as it's attempting to
put the vCPU into the INIT state, not the #RESET state.

But it only operates on the xstate header in memory while the target is
de-scheduled.  It's not using XSAVE/XRSTOR to load the results into
registers as far as I can tell.

>
>>   XRSTOR can't be used correctly outside of the subject context,
> Then are you suggesting e.g. xstate_set_init() is buggy?

No, but it switches into enough of v's context to function.  Really, it's
neither current nor remote context.

But its sole caller is adjust_bnd() in the emulator, so it's always
actually current context with a no-op on xcr0.

As said on Matrix, I think it's going to be necessary to remove MPX to
continue the XSAVE cleanup.

~Andrew
