On 12/03/2026 8:15 am, Jan Beulich wrote:
> On 11.03.2026 18:58, Andrew Cooper wrote:
>> In FRED mode, ERET is stricter than IRET about flags.  Notably this means:
>>
>>  * The vm86 bit (bit 17) and IOPL (bits 12,13) must be clear.
>>  * The sticky-1 reserved bit (bit 2) must be set, so dom0_construct() needs 
>> to
>>    set X86_EFLAGS_MBS in order for a PV dom0 to start.
>>  * All other reserved bits must be clear.
>>
>> Xen has been overly lax with reserved bit handling.  Adjust
>> arch_set_info_guest*() and hypercall_iret() which consume flags to clamp the
>> reserved bits for all guest types.
>>
>> This is a minor ABI change, but by the same argument as commit
>> 9f892f84c279 ("x86/domctl: Stop using XLAT_cpu_user_regs()"); the reserved
>> bits would get clamped like this naturally by hardware when the vCPU is run.
>>
>> The handling of vm86 is also different.  Guests under 32bit Xen really could
>> use vm86 mode, but Long Mode disallows vm86 mode and IRET simply ignores the
>> bit.  Xen's behaviour for a PV32 guest trying to use vm86 mode under a 64bit
>> Xen is to arrange to deliver #GP at the target of the IRET, rather than to
>> fail the IRET itself.
>>
>> However there's no filter filtering in arch_set_info_guest() itself, and it
> Nit: Excess "filter"?

Yes.  I noticed that immediately after sending.

>
>> can't arrange to queue a #GP at the target, so do the next best thing and 
>> fail
>> the hypercall.  This is not expected to create an issue for PV guests, as the
>> result of such an arch_set_info_guest() previously would be to run supposedly
>> Real Mode code as Protected Mode code.
>>
>> This allows PV guests to start when Xen is using FRED mode.
>>
>> Signed-off-by: Andrew Cooper <[email protected]>
> Reviewed-by: Jan Beulich <[email protected]>

Thanks.

>
> Nevertheless, a largely unrelated remark and two suggestions:
>
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -1193,6 +1193,14 @@ int arch_set_info_guest(
>>  
>>              if ( !__addr_ok(c.nat->ldt_base) )
>>                  return -EINVAL;
> Seeing this still in context: I had some trouble locating the position where
> you're making the change, as in my local tree this is long gone. Is there
> any chance we could make progress on "x86/PV: consolidate LDT checks" [1]?

I'll have another look, but this patch is going to need to go in first
as it needs backporting to 4.21.

>
>> +
>> +            /*
>> +             * IRET in Long Mode discards EFLAGS.VM, but in FRED mode ERET
>> +             * cares that it is zero.
>> +             *
>> +             * Guests can't see FRED, so emulate IRET behaviour.
>> +             */
>> +            c.nat->user_regs.rflags &= ~X86_EFLAGS_VM;
>>          }
>>  #ifdef CONFIG_COMPAT
>>          else
>> @@ -1205,6 +1213,18 @@ int arch_set_info_guest(
>>  
>>              for ( i = 0; i < ARRAY_SIZE(c.cmp->trap_ctxt); i++ )
>>                  fixup_guest_code_selector(d, c.cmp->trap_ctxt[i].cs);
>> +
>> +            /*
>> +             * Under 32bit Xen, PV guests could really use vm86 mode.  Under
>> +             * 64bit Xen, vm86 mode can't be entered even by PV32 guests.
>> +             *
>> +             * For backwards compatibility, compat HYPERCALL_iret will 
>> arrange
>> +             * to deliver #GP at the target of the IRET rather than to fail
>> +             * the IRET itself, but we can't arrange for the same behaviour
>> +             * here.  Reject the hypercall as the next best option.
>> +             */
>> +            if ( c.cmp->user_regs.eflags & X86_EFLAGS_VM )
>> +                return -EINVAL;
> Technically we could support VM86 mode, by fully emulating things. Hence I
> think -EOPNOTSUPP would be more appropriate.

Sorry, but I think you're rather too late on that suggestion.  Anyone
wanting vm86 mode can use a VM.

>
>>          }
>>  #endif
> Having all of the EFLAGS handling together would be nice. IOPL and IF handling
> sit further down. Could I talk you into moving these additions down there?

No, but not for ...

>  Yes,
> there are downsides to that: It looks to need another "compat" conditional, 
> and
> it would further the mix of state updates and error checks. Yet I still think
> having all of the EFLAGS stuff together is a benefit.

... these reasons.  The later position is after the point at which it's
buggy to fail the hypercall, because we've already reset the FPU amongst
other things.

This is a dire function in need of a lot of work.  I'm just leaving it
no more broken than it was before.

~Andrew

Reply via email to