On 13/03/18 17:14, Julien Grall wrote:
> Hi Andre,
> On 13/03/18 17:02, Andre Przywara wrote:
>> On 08/03/18 15:39, Julien Grall wrote:
>>> On 05/03/18 16:03, Andre Przywara wrote:
>>>> +/*
>>>> + * We don't actually support clearing the active state of an IRQ
>>>> (yet).
>>>> + * However there is a chance that most guests use this for
>>>> initialization.
>>>> + * We check whether this MMIO access would actually affect any active
>>>> IRQ,
>>>> + * and only print our warning in this case. So clearing already
>>>> non-active
>>>> + * IRQs would not be moaned about in the logs.
>>>> + */
>>>> +void vgic_mmio_write_cactive(struct vcpu *vcpu,
>>>> +                             paddr_t addr, unsigned int len,
>>>> +                             unsigned long val)
>>>> +{
>>>> +    uint32_t intid = VGIC_ADDR_TO_INTID(addr, 1);
>>>> +    unsigned int i;
>>>> +    bool bail_out = false;
>>>> +
>>>> +    for_each_set_bit( i, &val, len * 8 )
>>>> +    {
>>>> +        struct vgic_irq *irq = vgic_get_irq(vcpu->domain, vcpu, intid
>>>> + i);
>>>> +
>>>> +        /*
>>>> +         * If we know that the IRQ is active or we can't be sure about
>>>> +         * it (because it is currently in a CPU), log the not properly
>>>> +         * emulated MMIO access.
>>>> +         */
>>>> +        if ( irq->active || irq->vcpu )
>>>> +        {
>>>> +            gdprintk(XENLOG_ERR,
>>>> +                     "%pv: vGICD: IRQ%d: clearing active state not
>>>> supported\n",
>>> s/%d/%u/
>>>> +                     vcpu, irq->intid);
>>> gdprintk will always print the vCPU. Thought it is the current which
>>> might be different from vcpu (mostly in the re-dist case).
>> Ah, thanks. I always get confused about what which version of *printk
>> does.
>>> So I would use dprintk(XENLOG_G_ERR, "%pv: ..."). I would even be tempt
>>> to use printk(....) so we can spot potential issue on non-debug build.
>> Well, in the true spirit of Xen paranoia ;-) I wanted to avoid a guest
>> spamming the console.
> The guests messages are rate limited.

Ah ...

>> And in the end there is nothing a administrator
>> could really do about it. In my experience those messages tend to really
>> scare users ("I could boot Dom0 but I see those error messages ....").
> Xen message are not only here for the administrator, they are also here
> to help for the developer to get log to dissect.

Sure, see below ...

> I think that particular message should be printed in non-debug build
> because if the interrupt was active and can't clear it. Then something
> will go wrong later on.

Fair enough, if it's rate limited ...

>>>> +            bail_out = true;
>>> I admit the bailout is a bit weird here. You would only print the
>>> warning for the first activated IRQ and give the impression it is fine
>>> for the rest. So maybe you want to drop IRQ%d?
>> For the above reasons I wanted to keep them concise, so that we see that
>> the issue has happened, but avoid getting tons of error messages about
>> the same problem (as this may affect up to 32 IRQs).
>> But for debugging it might be good to know which IRQ was affected. I see
>> two use cases for a guest:
>> - (De-)activating a single IRQ: we get one message and know which IRQ it
>> was, so an admin can chase this down to a certain device (driver).
>> - (De-)activating *every* IRQ in this range (~0): we still get one
>> message per 32 IRQs, but can see whether it covers SPIs only (IRQ>=32)
>> and which ones.
>> So what about a compromise: I use dprintk(XENLOG_G_ERR, "%pv ...), print
>> the (first) IRQ and the *value* to be written. So a knowledgeable admin
>> can tell whether it's a single IRQ or a "clear/set-all" case. That
>> should also give enough info for debugging, but keeps it short.
> I can't see how a knowledgeable admin will be able to know that IRQ 2 is
> active with just the register value.

Well, I was assuming that a really knowledgeable admin would somehow
forward the error message either to the ML or at least to $search_engine.

>> Does that sound OK?
> I would still prefer the one per IRQ and using printk(XENLOG_G_*).

I really don't think one per IRQ is too useful. A developer however can
easily deal with "IRQ45, value: 0x00802000" from a log. And can deduce
from there that it's about IRQ45 and IRQ55. Following the example above
you would either see one "IRQ32, value: 0xffffffff" or "IRQ 45, value:
That looks like a good compromise between readability (having the IRQ
number for admins) and brevity.

I changed it now to output:
%pv: vGICD: clearing active state not supported (IRQ%u, value: 0x%08lx)

> I don't much care about the spam, see why above.

Having them on the console between Dom0 messages is really scary, but
not helpful if it's *more* than one. Since it's a known limitation of
the VGIC emulation, not a real "error" in that sense.


Xen-devel mailing list

Reply via email to