Hi Andrew,

Andrew Cooper <andrew.coop...@citrix.com> writes:

> On 03/12/2024 11:16 pm, Julien Grall wrote:
>> On Tue, 3 Dec 2024 at 22:00, Andrew Cooper <andrew.coop...@citrix.com> wrote:
>>> On 30/11/2024 1:10 am, Volodymyr Babchuk wrote:
>>>> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
>>>> index 2e27af4560..f855e97e25 100644
>>>> --- a/xen/arch/arm/setup.c
>>>> +++ b/xen/arch/arm/setup.c
>>>> @@ -341,6 +342,8 @@ void asmlinkage __init start_xen(unsigned long 
>>>> fdt_paddr)
>>>>       */
>>>>      system_state = SYS_STATE_boot;
>>>>
>>>> +    boot_stack_chk_guard_setup();
>>>> +
>>>>      if ( acpi_disabled )
>>>>      {
>>>>          printk("Booting using Device Tree\n");
>>> I still think that __stack_chk_guard wants setting up in ASM before
>>> entering C.
>>>
>>> The only reason this call is so late is because Xen's get_random()
>>> sequence is less than helpful.  That wants rewriting somewhat, but maybe
>>> now isn't the best time.
>>>
>>> Even if you initialise __stack_chk_guard it to -1 rather than 0, it's
>>> still got a better chance of catching errors during very early boot; the
>>> instrumentation is present, but is using 0 as the canary value.
>>>
>>> On x86, dumping the current TSC value into __stack_chk_guard would be
>>> far better than using -1.  Even if it skewed to a lower number, it's
>>> unpredictable and not going to reoccur by accident during a stack overrun.
>>>
>>> Surely ARM has something similar it could use?
>> There is a optional system register to read a random number.
>
> Only in v8.5 as far as I can see, and even then it's not required. 
> Also, it suffers from the same problem as RDRAND on x86; we need to boot
> to at least feature detection before we can safely use it if it's available.
>
>>
>>> [edit] Yes, get_cycles(), which every architecture seems to have.  In
>>> fact, swapping get_random() from NOW() to get_cycles() would be good
>>> enough to get it usable from early assembly.
>> Not quite. Technically we can't rely on the timer counter until
>> platform_init_time() is called.
>> This was to cater an issue on the exynos we used in OssTest. But
>> arguably this is the exception
>> rather than the norm because the firmware ought to properly initialize
>> the timer...
>>
>> I haven't checked recent firmware. But I could be convinced to access
>> the counter before
>> hand if we really think that setting __stack_chk_guard from ASM is much 
>> better.
>
> The C instrumentation is always present, right from the very start of
> start_xen().
>
> Even working with a canary of 0, there's some value.  It will spot
> clobbering with a non-zero value, but it won't spot e.g. an overly-long
> memset(, 0).
>
> During boot, we're not defending against a malicious entity.  Simply
> defending against bad developer expectations.
>
> Therefore, anything to get a non-zero value prior to entering C will be
> an improvement.  Best-effort is fine, and if there's one platform with
> an errata that causes it to miss out, then oh well.  Any other platform
> which manifests a crash will still lead to the problem being fixed.
>
> I suppose taking this argument to it's logical conclusion, we could
> initialise __stack_chk_guard with a poison pattern, although not one
> shared by any other poison pattern in Xen.  That alone would be better
> than using 0 in early boot.

Okay, so I come with three-stage initialization:

1. Static poison pattern
2. Time-based early value
3. Random-number based long-term value

So, apart from already present

static always_inline void boot_stack_chk_guard_setup(void);

I did this:

/*
 * Initial value is chosen by fair dice roll.
 * It will be updated during boot process.
 */
#if BITS_PER_LONG == 32
unsigned long __ro_after_init __stack_chk_guard = 0xdd2cc927;
#else
unsigned long __ro_after_init __stack_chk_guard = 0x2d853605a4d9a09c;
#endif

/* This function should be called from ASM only */
void __init asmlinkage boot_stack_chk_guard_setup_early(void)
{
    /*
     * Linear congruent generator. Constant is taken from
     * Tables Of Linear Congruential Generators
     * Of Different Sizes And Good Lattice Structure by Pierre L’Ecuyer
     */
#if BITS_PER_LONG == 32
    const unsigned long a = 2891336453;
#else
    const unsigned long a = 2862933555777941757;
#endif
    const unsigned c = 1;
    unsigned long cycles = get_cycles();

    if ( !cycles )
        return;

    __stack_chk_guard = cycles * a + c;
}

boot_stack_chk_guard_setup_early() is being called by ASM code during
early boot stages.

Then, later, boot_stack_chk_guard_setup() is called.

If you are okay with this approach, I'll send the next version.

-- 
WBR, Volodymyr

Reply via email to