Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

Jan Beulich Tue, 07 Dec 2021 01:55:47 -0800

On 07.12.2021 09:37, Michal Orzel wrote:
> On 06.12.2021 16:29, Julien Grall wrote:
>> On 06/12/2021 14:20, Michal Orzel wrote:
>>> to hypervisor when switching to AArch32 state.
>>>
> I will change to "from AArch32 state".
>>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>>> "If the general-purpose register was accessible from AArch32 state the
>>> upper 32 bits either become zero, or hold the value that the same
>>> architectural register held before any AArch32 execution.
>>> The choice between these two options is IMPLEMENTATIONDEFINED"
>>
>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>
> Ok.
>>>
>>> Currently Xen does not ensure that the top 32 bits are zeroed and this
>>> needs to be fixed.
>>
>> Can you outline why this is a problem and why we need to protect? IIRC, the 
>> main concern is Xen may misinterpret what the guest requested but we are not 
>> concerned about Xen using wrong value.
>>
> I would say:
> "
> The reason why this is a problem is that there are places in Xen where we 
> assume that top 32bits are zero for AArch32 guests.
> If they are not, this can lead to misinterpretation of Xen regarding what the 
> guest requested.
> For example hypercalls returning an error encoded in a signed long like 
> do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
> if the command passed as the first argument was clobbered,
> "
>>>
>>> Fix this bug by zeroing the upper 32 bits of these registers on an
>>> entry to hypervisor when switching to AArch32 state.
>>>
>>> Set default value of parameter compat of macro entry to 0 (AArch64 mode
>>> as we are on 64-bit hypervisor) to avoid checking if parameter is blank
>>> when not passed.
>>
>> Which error do you see otherwise? Is it a compilation error?
>>
> Yes, this is a compilation error. The errors appear at each line when "entry" 
> is called without passing value for "compat".
> So basically in all the places where entry is called with hyp=1.
> When taking the current patch and removing default value for compat you will 
> get:
> ```
> entry.S:254: Error: ".endif" without ".if"
> entry.S:258: Error: symbol `.if' is already defined
> entry.S:258: Error: ".endif" without ".if"
> entry.S:262: Error: symbol `.if' is already defined
> entry.S:262: Error: ".endif" without ".if"
> entry.S:266: Error: symbol `.if' is already defined
> entry.S:266: Error: ".endif" without ".if"
> entry.S:278: Error: symbol `.if' is already defined
> entry.S:278: Error: ".endif" without ".if"
> entry.S:292: Error: symbol `.if' is already defined
> entry.S:292: Error: ".endif" without ".if"
> entry.S:317: Error: symbol `.if' is already defined
> entry.S:317: Error: ".endif" without ".if"
> ```


An alternative might be to use

.if 0\compat

>>> --- a/xen/arch/arm/arm64/entry.S
>>> +++ b/xen/arch/arm/arm64/entry.S
>>> @@ -109,8 +109,16 @@
>>>    * If 0, we rely on the on x0/x1 to have been saved at the correct
>>>    * position on the stack before.
>>>    */
>>> -        .macro  entry, hyp, compat, save_x0_x1=1
>>> +        .macro  entry, hyp, compat=0, save_x0_x1=1
>>>           sub     sp, sp, #(UREGS_SPSR_el1 - UREGS_LR) /* CPSR, PC, SP, LR 
>>> */
>>> +
>>> +        /* Zero the upper 32 bits of the registers when switching to 
>>> AArch32 */
>>> +        .if \compat == 1      /* AArch32 mode */
>>> +        .irp 
>>> nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
>>> +        mov w\nr, w\nr
>>> +        .endr
>>> +        .endif
>>
>> So Jan mentioned, the x0/x1 may have already been saved. So you may need to 
>> fetch them from the stack and then clobber the top 32-bit.
>>
> So I would do the following:
> -fetch x0/x1 from the stack
> -clobber them
> -store them again on the stack
> 
> /*
>  * Zero the upper 32 bits of the gp registers when switching
>  * from AArch32.
>  */
> .if \compat == 1      /* AArch32 mode */
> 
> /* x0/x1 have already been saved so fetch them to zero top 32 bits */
> .if \save_x0_x1 == 0
> ldp     x0, x1, [sp], #(UREGS_kernel_sizeof - UREGS_X0)
> .endif
> 
> .irp 
> nr,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
> mov w\nr, w\nr
> .endr
> 
> .if \save_x0_x1 == 0
> stp     x0, x1, [sp, #-(UREGS_kernel_sizeof - UREGS_X0)]
> .endif
> 
> .endif

Wouldn't it be more efficient to store 32 bits of zero each into the
high halves of the respective stack slots? Afaict same code size, but
less memory / cache traffic. Plus it would avoid the latent issue of
a user of the macro actually expecting the two registers to retain
their values across the macro invocation.

I'm also puzzled by the two different memory addressing forms, but I
can easily see that I may be lacking enough Arm knowledge there.

Jan

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

Reply via email to