On 22.09.2021 14:58, Andrew Cooper wrote:
> On 22/09/2021 08:01, Jan Beulich wrote:
>> On 21.09.2021 19:51, Andrew Cooper wrote:
>>> On 21/09/2021 07:53, Jan Beulich wrote:
>>>> On 20.09.2021 19:25, Andrew Cooper wrote:
>>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>>> @@ -5063,8 +5063,9 @@ long do_hvm_op(unsigned long op, 
>>>>> XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>          if ( copy_from_guest(&tr, arg, 1 ) )
>>>>>              return -EFAULT;
>>>>>  
>>>>> -        if ( tr.extra_bytes > sizeof(tr.extra)
>>>>> -             || (tr.event & ~((1u<<TRC_SUBCLS_SHIFT)-1)) )
>>>>> +        if ( tr.extra_bytes % sizeof(uint32_t) ||
>>>>> +             tr.extra_bytes > sizeof(tr.extra) ||
>>>>> +             tr.event >> TRC_SUBCLS_SHIFT )
>>>>>              return -EINVAL;
>>>> Despite this being a function that supposedly no-one is to really
>>>> use, you're breaking the interface here when really there would be a
>>>> way to be backwards compatible: Instead of failing, pad the data to
>>>> a 32-bit boundary. As the interface struct is large enough, this
>>>> would look to be as simple as a memset() plus aligning extra_bytes
>>>> upwards. Otherwise the deliberate breaking of potential existing
>>>> callers wants making explicit in the respective paragraph of the
>>>> description.
>>> It is discussed, along with a justification for why an ABI change is fine.
>> What you say is "This has no business being a hypercall in the first
>> place", yet to me that's not a justification to break an interface.
> 
> No, but "cannot be used outside of custom debugging" means there are no
> users in practice, and therefore it really doesn't matter.
> 
>> It is part of the ABI, so disallowing what was previously allowed
>> may break people's (questionable, yes) code.
>>
>>> But I could do
>>>
>>> tr.extra_bytes = ROUNDUP(tr.extra_bytes, sizeof(uint32_t));
>>>
>>> if you'd really prefer.
>> I would, indeed, and as said ideally alongside clearing the excess
>> bytes in the buffer.
> 
> Why?  The entire structure is copied out of guest memory, with a fixed size.
> 
> It's not Xen's fault/problem if the VM didn't initialise it correctly,
> and an explicit ROUNDUP() here maintains the current behaviour.

What I don't understand is what you derive "didn't initialise it correctly"
from. The public header says nothing. The data field being of type uint8_t[]
may very well suggest that any size is fine. Propagating rubbish instead of
predictable values (zeroes) seems wrong to me.

Anyway - I'm not going to insist; I merely think we should be as forgiving
as possible in situations like this.

>>>>> --- a/xen/common/sched/rt.c
>>>>> +++ b/xen/common/sched/rt.c
>>>>> @@ -968,18 +968,20 @@ burn_budget(const struct scheduler *ops, struct 
>>>>> rt_unit *svc, s_time_t now)
>>>>>      /* TRACE */
>>>>>      {
>>>>>          struct __packed {
>>>>> -            unsigned unit:16, dom:16;
>>>>> +            uint16_t unit, dom;
>>>>>              uint64_t cur_budget;
>>>>> -            int delta;
>>>>> -            unsigned priority_level;
>>>>> -            bool has_extratime;
>>>>> -        } d;
>>>>> -        d.dom = svc->unit->domain->domain_id;
>>>>> -        d.unit = svc->unit->unit_id;
>>>>> -        d.cur_budget = (uint64_t) svc->cur_budget;
>>>>> -        d.delta = delta;
>>>>> -        d.priority_level = svc->priority_level;
>>>>> -        d.has_extratime = svc->flags & RTDS_extratime;
>>>>> +            uint32_t delta;
>>>> The original field was plain int, and aiui for a valid reason. I
>>>> don't see why you couldn't use int32_t here.
>>> delta can't be negative, because there is a check earlier in the function.
>> Oh, yes, didn't spot that.
>>
>>> What is a problem is the 63=>32 bit truncation, and uint32_t here is
>>> half as bad as int32_t.
>> Agreed. Whether the truncation is an issue in practice is questionable,
>> as I wouldn't expect budget to be consumed in multiple-second individual
>> steps. But I didn't check whether this scheduler might allow a vCPU to
>> run for this long all in one go.
> 
> I expect it's marginal too.  Honestly, its not a bug I care to fix right
> about now.  I could leave a /* TODO: truncation? */ in place so whomever
> encounters weird behaviour from this trace record has a bit more help of
> where to look?

Would seem reasonable to me, but really needs to be answered by the
maintainers of this code.

Jan


Reply via email to