On 18.07.2022 14:07, Andrew Cooper wrote:
> On 18/07/2022 10:49, Jan Beulich wrote:
>> On 18.07.2022 11:31, Andrew Cooper wrote:
>>> On 18/07/2022 10:11, Jan Beulich wrote:
>>>> On 15.07.2022 15:26, Andrew Cooper wrote:
>>>>> --- a/xen/tools/check-endbr.sh
>>>>> +++ b/xen/tools/check-endbr.sh
>>>>> @@ -61,19 +61,36 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | 
>>>>> cut -f 1 -d ':' > $VALID &
>>>>>  #    the lower bits, rounding integers to the nearest 4k.
>>>>>  #
>>>>>  #    Instead, use the fact that Xen's .text is within a 1G aligned 
>>>>> region, and
>>>>> -#    split the VMA in half so AWK's numeric addition is only working on 
>>>>> 32 bit
>>>>> -#    numbers, which don't lose precision.
>>>>> +#    split the VMA so AWK's numeric addition is only working on <32 bit
>>>>> +#    numbers, which don't lose precision.  (See point 5)
>>>>>  #
>>>>>  # 4) MAWK doesn't support plain hex constants (an optional part of the 
>>>>> POSIX
>>>>>  #    spec), and GAWK and MAWK can't agree on how to work with hex 
>>>>> constants in
>>>>>  #    a string.  Use the shell to convert $vma_lo to decimal before 
>>>>> passing to
>>>>>  #    AWK.
>>>>>  #
>>>>> +# 5) Point 4 isn't fully portable.  POSIX only requires that $((0xN)) be
>>>>> +#    evaluated as long, which in 32bit shells turns negative if bit 31 
>>>>> of the
>>>>> +#    VMA is set.  AWK then interprets this negative number as a double 
>>>>> before
>>>>> +#    adding the offsets from the binary grep.
>>>>> +#
>>>>> +#    Instead of doing an 8/8 split with vma_hi/lo, do a 9/7 split.
>>>>> +#
>>>>> +#    The consequence of this is that for all offsets, $vma_lo + offset 
>>>>> needs
>>>>> +#    to be less that 256M (i.e. 7 nibbles) so as to be successfully 
>>>>> recombined
>>>>> +#    with the 9 nibbles of $vma_hi.  This is fine; .text is at the start 
>>>>> of a
>>>>> +#    1G aligned region, and Xen is far far smaller than 256M, but leave 
>>>>> safety
>>>>> +#    check nevertheless.
>>>>> +#
>>>>>  eval $(${OBJDUMP} -j .text $1 -h |
>>>>> -    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 
>>>>> 8), substr($4, 9, 16)}')
>>>>> +    $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 
>>>>> 9), substr($4, 10, 16)}')
>>>>>  
>>>>>  ${OBJCOPY} -j .text $1 -O binary $TEXT_BIN
>>>>>  
>>>>> +bin_sz=$(stat -c '%s' $TEXT_BIN)
>>>>> +[ "$bin_sz" -ge $(((1 << 28) - $vma_lo)) ] &&
>>>>> +    { echo "$MSG_PFX Error: .text offsets can exceed 256M" >&2; exit 1; }
>>>> ... s/can/cannot/ ?
>>> Why?  "Can" is correct here.  If the offsets can't exceed 256M, then
>>> everything is good.
>> Hmm, the wording then indeed is ambiguous.
> 
> I see your point.  In this case it's meant as "are able to", but this is
> still clearer than using "can't" because at least the text matches the
> check which triggered it.
> 
>>  I read "can" as "are allowed
>> to", when we mean "aren't allowed to". Maybe ".text is 256M or more in
>> size"? If you mention "offsets", then I think the check should be based
>> on actually observing an offset which is too large (which .text size
>> alone doesn't guarantee will happen).
> 
> It's not just .text on its own because the VMA of offset by 2M, hence
> the subtraction of $vma_lo in the main calculation.
> 
> There's no point searching for offsets.  There will be one near the end,
> so all searching for an offset would do is complicate the critical loop.
> 
> How about ".text offsets must not exceed 256M" ?
> 
> That should be unambiguous.

Yes, that reads fine. Thanks.

Jan

Reply via email to