On 16.11.25 13:24, Julien Grall wrote:
> Hi,
>
> On 12/11/2025 10:51, Mykyta Poturai wrote:
>> Move IRQs from a dying CPU to the online ones.
>> Guest-bound IRQs are already handled by the scheduler in the process of
>> moving vCPUs to active pCPUs, so we only need to handle IRQs used by Xen
>> itself.
>>
>> If an IRQ is to be migrated, its affinity is set to a mask of all online
>> CPUs. With the current GIC implementation, this means it is routed to a
>> random online CPU. This may cause extra moves if multiple cores are
>> disabled in sequence, but should prevent all interrupts from piling up
>> on CPU0 in case of repeated up-down cycles on different cores.
>
> Wouldn't they eventually all move to CPU0 in the case of suspend/resume
> or if all the CPUs but CPU0 are turned off and then on? If so,
> shouldn't we try to rebalance the interrupts?
>
In case of disabling/enabling all cores in a sequence, yes. This was
designed with the idea of achieving some balancing when enabling/disabling
some cores for power-saving reasons. I agree that proper balancing should
be implemented, but it is a complex task on its own and requires a
substantial amount of testing on different hardware to prove it is close
to optimal, so I think implementing it is out of scope for what I'm trying
to do here. If that is okay, I can implement a relatively simple solution
for now: just adding onlined CPUs back to the affinity mask. I think it
should improve the situation for the "switching all cores" case.

>>
>> IRQs from CPU 0 are never migrated, as a dying CPU 0 means we are either
>> shutting down completely or entering system suspend.
>
> I can't find a place where __cpu_disable() is called on CPU0. Do you
> have any pointer? In any case, I am not sure I want to bake that
> assumption in more places of the code.
>

I assume it would be called when suspend is implemented. In any case, I
will rework this to replace the hard check for CPU 0 with an "is it the
last CPU online" one.

>>
>> Considering that all Xen-used IRQs are currently allocated during init
>> on CPU 0, and setup_irq uses smp_processor_id for the initial affinity.
>
> Looking at the SMMU driver, we seem to request IRQs at the time the
> device is attached. So are you sure about this?
>

Indeed, I have missed that one. I will remove those statements then.

>> This change is not strictly required for correct operation for now, but
>> it should future-proof CPU hotplug and system suspend support in case
>> some kind of IRQ balancing is implemented later.
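To make the "add onlined CPUs back" idea above a bit more concrete, here is a toy, compile-checkable model of the mask handling I have in mind (affinities modelled as plain bitmasks and hypothetical model_* helpers, not the real Xen cpumask API):

```c
/* Toy model: bit n of a mask stands for pCPU n.  These are NOT the real
 * Xen cpumask types/helpers, just enough to illustrate the idea. */
static unsigned long online_mask = 0xfUL;       /* CPUs 0-3 online */
static unsigned long irq_affinity = 1UL << 2;   /* IRQ bound to CPU2 */

/* cpu_down path: if the dying CPU is in the IRQ's affinity, widen the
 * affinity to all remaining online CPUs (what evacuate_irq() does). */
static void model_cpu_down(unsigned int cpu)
{
    online_mask &= ~(1UL << cpu);
    if ( irq_affinity & (1UL << cpu) )
        irq_affinity = online_mask;
}

/* cpu_up path: the proposed simple rebalancing - just add the newly
 * onlined CPU back into the affinity mask. */
static void model_cpu_up(unsigned int cpu)
{
    online_mask |= 1UL << cpu;
    irq_affinity |= 1UL << cpu;
}
```

Taking CPU2 down leaves the IRQ eligible on CPUs {0,1,3}; bringing CPU2 back up makes it a candidate again, so repeated up/down cycles on different cores should not funnel everything onto CPU0.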
>>
>> Signed-off-by: Mykyta Poturai <[email protected]>
>>
>> v3->v4:
>> * patch introduced
>> ---
>>  xen/arch/arm/include/asm/irq.h |  2 ++
>>  xen/arch/arm/irq.c             | 39 ++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/smpboot.c         |  2 ++
>>  3 files changed, 43 insertions(+)
>>
>> diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h
>> index 09788dbfeb..6e6e27bb80 100644
>> --- a/xen/arch/arm/include/asm/irq.h
>> +++ b/xen/arch/arm/include/asm/irq.h
>> @@ -126,6 +126,8 @@ bool irq_type_set_by_domain(const struct domain *d);
>>  void irq_end_none(struct irq_desc *irq);
>>  #define irq_end_none irq_end_none
>> +void evacuate_irqs(unsigned int from);
>> +
>>  #endif /* _ASM_HW_IRQ_H */
>>  /*
>>   * Local variables:
>> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
>> index 28b40331f7..b383d71930 100644
>> --- a/xen/arch/arm/irq.c
>> +++ b/xen/arch/arm/irq.c
>> @@ -158,6 +158,45 @@ static int init_local_irq_data(unsigned int cpu)
>>      return 0;
>>  }
>> +static void evacuate_irq(int irq, unsigned int from)
>
> Any reason why the 'irq' is signed?
>
>> +{
>> +    struct irq_desc *desc = irq_to_desc(irq);
>> +    unsigned long flags;
>> +
>> +    /* Don't move irqs from CPU 0 as it is always last to be disabled */
>
> Per above, I am not convinced that we should special case CPU 0. But if
> we do, then shouldn't this be part of evacuate_irqs() so we don't
> pointlessly go through all the IRQs?
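Agreed on both points. For v5 I plan to drop the CPU 0 special case and instead have evacuate_irqs() bail out early when the dying CPU was the last one online. A stubbed, compile-checkable sketch of just that check (the cpumask bits here are stand-ins, not Xen's real types):

```c
#include <stdbool.h>

/* Minimal stand-ins for Xen's cpumask API, only to show the shape of
 * the check; the real code would use the real cpumask_t and helpers. */
typedef struct { unsigned long bits; } cpumask_t;
static cpumask_t cpu_online_map;
static bool cpumask_empty(const cpumask_t *m) { return m->bits == 0; }

/*
 * Proposed check: __cpu_disable() has already cleared 'from' from
 * cpu_online_map, so an empty map means we are taking down the last
 * online CPU (full shutdown or suspend) and there is nowhere to move
 * IRQs to - no need to walk them at all.
 */
static bool should_evacuate(unsigned int from)
{
    (void)from; /* 'from' itself is no longer special-cased as CPU 0 */
    return !cpumask_empty(&cpu_online_map);
}
```

This also addresses the second point: the early return sits in evacuate_irqs(), before the per-IRQ loop, rather than being re-evaluated inside evacuate_irq() for every IRQ.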
>
>> +    if ( from == 0 )
>> +        return;
>> +
>> +    ASSERT(!cpumask_empty(&cpu_online_map));
>> +    ASSERT(!cpumask_test_cpu(from, &cpu_online_map));
>> +
>> +    spin_lock_irqsave(&desc->lock, flags);
>> +    if ( likely(!desc->action) )
>> +        goto out;
>> +
>> +    if ( likely(test_bit(_IRQ_GUEST, &desc->status) ||
>> +                test_bit(_IRQ_MOVE_PENDING, &desc->status)) )
>> +        goto out;
>> +
>> +    if ( cpumask_test_cpu(from, desc->affinity) )
>> +        irq_set_affinity(desc, &cpu_online_map);
>
> I think it would be worth explaining why we are routing to any online
> CPU rather than checking whether the affinity has other online CPUs.
>
> Just to note, I don't have a strong opinion either way. It mainly needs
> to be documented.
>
>> +
>> +out:
>> +    spin_unlock_irqrestore(&desc->lock, flags);
>> +    return;
>> +}
>> +
>> +void evacuate_irqs(unsigned int from)
>> +{
>> +    int irq;
>> +
>> +    for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
>> +        evacuate_irq(irq, from);
>> +
>> +    for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )
>
> AFAICT, irq_to_desc() would not be able to cope with ESPI interrupts
> when CONFIG_GICV3_ESPI is not set. Has this been tested?
>
>> +        evacuate_irq(irq, from);
>> +}
>> +
>>  static int cpu_callback(struct notifier_block *nfb, unsigned long action,
>>                          void *hcpu)
>>  {
>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>> index 7f3cfa812e..46b24783dd 100644
>> --- a/xen/arch/arm/smpboot.c
>> +++ b/xen/arch/arm/smpboot.c
>> @@ -425,6 +425,8 @@ void __cpu_disable(void)
>>      smp_mb();
>> +    evacuate_irqs(cpu);
>
> I think it would be worth explaining why evacuate_irqs() is called this
> late in the process.
>
>> +
>>      /* Return to caller; eventually the IPI mechanism will unwind and the
>>       * scheduler will drop to the idle loop, which will call stop_cpu(). */
>>  }
>
> Cheers,

-- 
Mykyta
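P.S. Regarding the ESPI question: no, builds without CONFIG_GICV3_ESPI were not covered; my intention for v5 is to compile the ESPI loop out when CONFIG_GICV3_ESPI is not set, so irq_to_desc() is never handed an ESPI INTID on such builds. A standalone sketch of the shape (the INTID values and the evacuate_irq() stub below are placeholders, the real ones come from Xen's headers):

```c
/* Hypothetical stand-ins so this compiles on its own. */
#define CONFIG_GICV3_ESPI 1
#define ESPI_BASE_INTID 4096
#define ESPI_MAX_INTID  5120

static int evacuated;
static void evacuate_irq(int irq, unsigned int from)
{
    (void)irq; (void)from;
    evacuated++;
}

/* Walk the ESPI range only when ESPI support is compiled in. */
static void evacuate_espi_irqs(unsigned int from)
{
#ifdef CONFIG_GICV3_ESPI
    int irq;

    for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )
        evacuate_irq(irq, from);
#endif
}
```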
