On 15.10.25 20:30, Mykola Kvach wrote:
> Hi Mykyta,
>
> Thanks for the series.
>
> It seems there might be issues here -- please take a look and let me
> know if my concerns are valid:
>
> 1. FF-A notification IRQ: after a CPU down->up cycle the IRQ
>    configuration may be lost.

OPTEE and FFA are marked as unsupported.

> 2. GICv3 LPIs: a CPU may fail to come back up unless its LPI pending
>    table exists (is allocated) on bring-up. See
>    gicv3_lpi_allocate_pendtable() and its call chain.

ITS is marked as unsupported. I have a plan to deal with this, but it is
out of scope of this series.

> 3. IRQ migration on CPU down: if an IRQ targets a CPU being offlined,
>    its affinity should be moved to an online CPU before completing the
>    offlining.

All guest tied IRQ migration is handled by the scheduler. Regarding the
irqs used by Xen, I didn't find any with affinity to other CPUs than
CPU 0, which can't be disabled. I think theoretically it is possible for
them to have different affinity, but it seems unlikely considering that
x86 hotplug code also doesn't seem to do any Xen irq migration AFAIU.

> 4. Race between the new hypercalls and disable/enable_nonboot_cpus():
>    disable_nonboot_cpus is called, enable_nonboot_cpus() reads
>    frozen_cpus, and before it calls cpu_up() a hypercall onlines the CPU.
>    cpu_up() then fails as "already online", but the CPU_RESUME_FAILED
>    path may still run for an already-online CPU, risking use-after-free
>    of per-CPU state (e.g. via free_percpu_area()) and other issues
>    related to CPU_RESUME_FAILED notification.
>

There don't seem to be any calls to disable/enable_nonboot_cpus() on
Arm. If we take x86 as an example, then they are called with all domains
already paused, and I don't see how paused domains can issue hypercalls.

>
> Best regards,
> Mykola

--
Mykyta
