On 16.10.2025 17:34, Roger Pau Monné wrote:
> On Thu, Oct 16, 2025 at 01:57:41PM +0200, Jan Beulich wrote:
>> On 16.10.2025 13:42, Roger Pau Monné wrote:
>>> On Thu, Oct 16, 2025 at 09:31:42AM +0200, Jan Beulich wrote:
>>>> Keeping channels enabled when they're unused is only causing problems:
>>>> Extra interrupts harm performance, and extra nested interrupts could even
>>>> have caused worse problems.
>>>>
>>>> Note that no explicit "enable" is necessary - that's implicitly done by
>>>> set_channel_irq_affinity() once the channel goes into use again.
>>>>
>>>> Along with disabling the counter, also "clear" the channel's "next event",
>>>> for it to be properly written by whatever the next user is going to want
>>>> (possibly avoiding too early an IRQ).
>>>>
>>>> Further, along the same lines, don't enable channels early when starting
>>>> up an IRQ. This similarly should happen no earlier than from
>>>> set_channel_irq_affinity() (here: once a channel goes into use the very
>>>> first time). This eliminates a single instance of
>>>>
>>>> (XEN) [VT-D]INTR-REMAP: Request device [0000:00:1f.0] fault index 0
>>>> (XEN) [VT-D]INTR-REMAP: reason 25 - Blocked a compatibility format interrupt request
>>>>
>>>> during boot. (Why exactly there's only one instance, when we use multiple
>>>> counters and hence multiple IRQs, I can't tell. My understanding would be
>>>> that this was due to __hpet_setup_msi_irq() being called only after
>>>> request_irq() [and hence the .startup handler], yet that should have
>>>> affected all channels.)
>>>>
>>>> Fixes: 3ba523ff957c ("CPUIDLE: enable MSI capable HPET for timer broadcast")
>>>> Signed-off-by: Jan Beulich <[email protected]>
>>>> ---
>>>> A window still remains for IRQs to be caused by stale comparator values:
>>>> hpet_attach_channel() is called ahead of reprogram_hpet_evt_channel().
>>>> Should we also write the comparator to "far into the future"?
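(As an illustration of the "far into the future" idea: assuming Xen's
hpet_read32()/hpet_write32() MMIO accessors and the HPET_Tn_CMP()/
HPET_COUNTER register macros from asm/hpet.h, such a write could look
roughly like the sketch below. The helper name and its placement are
made up for this illustration; they are not taken from the patch.)

    /*
     * Hypothetical sketch: park the comparator as far ahead as a
     * 32-bit comparator allows, so a stale value cannot raise an IRQ
     * before the channel's next user reprograms it.  In modulo-2^32
     * terms, counter - 1 is the farthest point in the future.
     */
    static void hpet_park_comparator(struct hpet_event_channel *ch)
    {
        hpet_write32(hpet_read32(HPET_COUNTER) - 1,
                     HPET_Tn_CMP(ch->idx));
    }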
>>>
>>> It might be helpful to reprogram the comparator as far ahead as
>>> possible in hpet_attach_channel() before enabling it, or
>>> alternatively in hpet_detach_channel().
>>
>> The downside is yet another (slow) MMIO access, which is why I didn't
>> make such a change right away. Plus I wasn't quite sure about the
>> locking there: imo, if we did so, it would be better if the lock
>> wasn't dropped in between.
>>
>>>> @@ -542,6 +540,8 @@ static void hpet_detach_channel(unsigned
>>>>          spin_unlock_irq(&ch->lock);
>>>>      else if ( (next = cpumask_first(ch->cpumask)) >= nr_cpu_ids )
>>>>      {
>>>> +        hpet_disable_channel(ch);
>>>> +        ch->next_event = STIME_MAX;
>>>>          ch->cpu = -1;
>>>>          clear_bit(HPET_EVT_USED_BIT, &ch->flags);
>>>>          spin_unlock_irq(&ch->lock);
>>>
>>> I'm a bit confused about what the HPET code does here (don't know
>>> enough about it, and there are no comments).  Why is the timer rotated
>>> to a CPU in ch->cpumask once disabled, instead of just being plain
>>> disabled?
>>
>> Because it will still be needed by the other CPUs that the channel is
>> shared with.
> 
> Yeah, I missed that part: the channel is migrated to a different CPU.  I
> wonder however: since an active channel can be migrated around between
> CPUs, isn't there a risk of the timer firing just in the middle of
> migration (when interrupt generation is disabled), and hence Xen
> possibly missing a deadline?
> 
> In hpet_broadcast_exit() we need to check whether the timer has
> expired after the migration, and manually trigger a broadcast if
> needed.  This risks doing two broadcasts back-to-back, but
> it's the only option I see to avoid missing a deadline.
> 
> Maybe there's something I'm missing, this is all fairly complex.

set_channel_irq_affinity() invokes handle_hpet_broadcast() to cover that
case.

Jan
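
(To make the catch-up path concrete: a rough sketch of the idea follows,
where hpet_resume_channel() and hpet_enable_channel() are invented names
for this illustration, while ch->next_event, NOW() and
handle_hpet_broadcast() match identifiers used in xen/arch/x86/hpet.c.
As noted above, the real code performs this catch-up from
set_channel_irq_affinity().)

    /*
     * Illustrative sketch only, not the upstream code: after moving a
     * channel to a new CPU and re-enabling interrupt generation, check
     * whether the deadline already passed while the channel was masked
     * and, if so, run the broadcast handler by hand so no waiting CPU
     * misses its wakeup.
     */
    static void hpet_resume_channel(struct hpet_event_channel *ch)
    {
        hpet_enable_channel(ch);          /* hypothetical helper */

        if ( ch->next_event <= NOW() )
            handle_hpet_broadcast(ch);
    }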
