On 23/09/2024 2:06 pm, Roger Pau Monné wrote:
> On Mon, Sep 23, 2024 at 12:17:51PM +0100, Andrew Cooper wrote:
>> On 20/09/2024 10:36 am, Roger Pau Monne wrote:
>>> alternatives is used both at boot time, and when loading livepatch payloads.
>>> While for the former it makes sense to panic, it's not useful for the 
>>> later, as
>>> for livepatches it's possible to fail to load the livepatch if alternatives
>>> cannot be resolved and continue operating normally.
>>>
>>> Relax the BUGs in _apply_alternatives() to instead return an error code if
>>> alternatives are applied after boot.
>>>
>>> Add a dummy livepatch test so the new logic can be assessed, with the change
>>> applied the loading now fails with:
>>>
>>> alt table ffff82d040602024 -> ffff82d040602032
>>> livepatch: xen_alternatives_fail applying alternatives failed: -22
>>>
>>> Signed-off-by: Roger Pau Monné <roge.r...@citrix.com>
>>> ---
>>>  xen/arch/x86/alternative.c                 | 29 ++++++++++++++++------
>>>  xen/arch/x86/include/asm/alternative.h     |  3 ++-
>>>  xen/common/livepatch.c                     | 10 +++++++-
>>>  xen/test/livepatch/Makefile                |  5 ++++
>>>  xen/test/livepatch/xen_alternatives_fail.c | 29 ++++++++++++++++++++++
>>>  5 files changed, 66 insertions(+), 10 deletions(-)
>>>  create mode 100644 xen/test/livepatch/xen_alternatives_fail.c
>>>
>>> diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
>>> index 7824053c9d33..c0912cb12ea5 100644
>>> --- a/xen/arch/x86/alternative.c
>>> +++ b/xen/arch/x86/alternative.c
>>> @@ -174,10 +174,13 @@ extern void *const __initdata_cf_clobber_end[];
>>>   * The caller will set the "force" argument to true for the final
>>>   * invocation, such that no CALLs/JMPs to NULL pointers will be left
>>>   * around. See also the further comment below.
>>> + *
>>> + * Note the function cannot fail if system_state < SYS_STATE_active, it 
>>> would
>>> + * panic instead.  The return value is only meaningful for runtime usage.
>>>   */
>>> -static void init_or_livepatch _apply_alternatives(struct alt_instr *start,
>>> -                                                  struct alt_instr *end,
>>> -                                                  bool force)
>>> +static int init_or_livepatch _apply_alternatives(struct alt_instr *start,
>>> +                                                 struct alt_instr *end,
>>> +                                                 bool force)
>>>  {
>>>      struct alt_instr *a, *base;
>>>  
>>> @@ -198,9 +201,17 @@ static void init_or_livepatch 
>>> _apply_alternatives(struct alt_instr *start,
>>>          uint8_t buf[MAX_PATCH_LEN];
>>>          unsigned int total_len = a->orig_len + a->pad_len;
>>>  
>>> -        BUG_ON(a->repl_len > total_len);
>>> -        BUG_ON(total_len > sizeof(buf));
>>> -        BUG_ON(a->cpuid >= NCAPINTS * 32);
>>> +#define BUG_ON_BOOT(cond)                                       \
>>> +    if ( system_state < SYS_STATE_active )                      \
>>> +        BUG_ON(cond);                                           \
>>> +    else if ( cond )                                            \
>>> +        return -EINVAL;
>>> +
>>> +        BUG_ON_BOOT(a->repl_len > total_len);
>>> +        BUG_ON_BOOT(total_len > sizeof(buf));
>>> +        BUG_ON_BOOT(a->cpuid >= NCAPINTS * 32);
>>> +
>>> +#undef BUG_ON_BOOT
>> The "error handling" before was horrible and this isn't really any better.
>>
>> This function should always return int, printing more helpful info than
>> that (a printk() says a thousand things better than a BUG()), and
>> nmi_apply_alternatives can panic() rather than leaving these BUG()s here.
> OK, will rework the logic here so it's the caller that panics (or not)
> as necessary, and _apply_alternatives() always prints some error
> message.

Yes.  With alternatives now behind sensible checks in the livepatch
case, any failure here is relevant and wants printing.

>
>> As a tangent, the __must_check doesn't seem to have been applied to
>> nmi_apply_alternatives(), but I'd suggest dropping the attribute; I
>> don't think it adds much.
> I didn't see the value in adding the attribute to
> nmi_apply_alternatives(), as in that context _apply_alternatives()
> would unconditionally panic instead of returning an error code.

Ah, it was only apply_alternatives() you made __must_check, not
_apply_alternatives(), which is why there isn't a compile error in
nmi_apply_alternatives().

Still, I don't think the __must_check is much use.

>
>>> diff --git a/xen/test/livepatch/xen_alternatives_fail.c 
>>> b/xen/test/livepatch/xen_alternatives_fail.c
>>> new file mode 100644
>>> index 000000000000..01d289095a31
>>> --- /dev/null
>>> +++ b/xen/test/livepatch/xen_alternatives_fail.c
>>> @@ -0,0 +1,29 @@
>>> +/*
>>> + * Copyright (c) 2024 Cloud Software Group.
>>> + *
>>> + */
>>> +
>>> +#include "config.h"
>>> +#include <xen/lib.h>
>>> +#include <xen/livepatch_payload.h>
>>> +
>>> +#include <asm/alternative.h>
>>> +#include <asm/cpuid.h>
>>> +
>>> +void test_alternatives(void)
>>> +{
>>> +    alternative("", "", NCAPINTS * 32);
>>> +}
>>> +
>>> +/* Set a hook so the loading logic in Xen don't consider the payload 
>>> empty. */
>>> +LIVEPATCH_LOAD_HOOK(test_alternatives);
>>> +
>>> +/*
>>> + * Local variables:
>>> + * mode: C
>>> + * c-file-style: "BSD"
>>> + * c-basic-offset: 4
>>> + * tab-width: 4
>>> + * indent-tabs-mode: nil
>>> + * End:
>>> + */
>> The second half of the patch (new testcase) is all fine and good, but
>> should pass with patch 2 in place?  I'd suggest splitting it out.
> No, not really.  The Xen buildid for this patch will be correctly set
> to match the running one, but the alternatives feature CPUID is
> explicitly set to an out of range value (NCAPINTS * 32) to trigger the
> BUG_ON condition.

Ah yes.  Good point.

In which case it probably ought to stay in this patch.

> Further thinking about it, I think we should add a build time assert
> that the feature parameters in the alternative calls are smaller than
> NCAPINTS * 32.

A build check where?  It's quite hard to do in alternatives themselves.

~Andrew

Reply via email to