On 13.05.2025 21:20, Marek Marczykowski-Górecki wrote:
> Hi,
> 
> When debugging CI job on Linus' master branch, I added "console=vga 
> vga=,keep" and got PV dom0 crash Xen with:
> 
> (XEN) [   40.870435] Assertion 'desc->arch.creator_domid == DOMID_INVALID' 
> failed at arch/x86/irq.c:294
> (XEN) [   40.886925] ----[ Xen-4.21-unstable  x86_64  debug=y ubsan=y  Not 
> tainted ]----
> (XEN) [   40.903356] CPU:    10
> (XEN) [   40.919824] RIP:    e008:[<ffff82d04059d2ed>] create_irq+0x272/0x339
> (XEN) [   40.936267] RFLAGS: 0000000000010297   CONTEXT: hypervisor (d0v13)
> (XEN) [   40.952709] rax: 00000000fffffff4   rbx: ffff830498007c00   rcx: 
> 0000000000001899

There looks to be a -ENOMEM in %rax, so ...

> (XEN) [   40.969147] rdx: ffff830498007c5e   rsi: 0000000000000002   rdi: 
> ffff83049830e398
> (XEN) [   40.985892] rbp: ffff830498307d18   rsp: ffff830498307ce8   r8:  
> 0000000000000000
> (XEN) [   41.003093] r9:  0000000000000000   r10: 0000000000000000   r11: 
> 0000000000000000
> (XEN) [   41.020279] r12: 00000000fffffff4   r13: 000000000000007c   r14: 
> ffffffffffffffff
> (XEN) [   41.037489] r15: 000000000000007c   cr0: 0000000080050033   cr4: 
> 0000000000b526e0
> (XEN) [   41.054699] cr3: 0000000492c34000   cr2: ffff8881001603f0
> (XEN) [   41.071904] fsb: 0000000000000000   gsb: ffff8882365ac000   gss: 
> 0000000000000000
> (XEN) [   41.089116] ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   
> cs: e008
> (XEN) [   41.106320] Xen code around <ffff82d04059d2ed> 
> (create_irq+0x272/0x339):
> (XEN) [   41.123521]  3f d9 ff e9 cc fe ff ff <0f> 0b 48 8d 3d 5a a0 29 00 e8 
> f4 3d d9 ff c6 43
> (XEN) [   41.140739] Xen stack trace from rsp=ffff830498307ce8:
> (XEN) [   41.157931]    000000ff00000001 ffff830497faa000 ffff830498307e00 
> 00000000ffffffff
> (XEN) [   41.175132]    0000000000010000 ffff830497faa160 ffff830498307d70 
> ffff82d0405a6f85
> (XEN) [   41.192351]    00000000000002a0 ffff830498307e24 0000000000000200 
> 00000000ffffffff
> (XEN) [   41.209551]    ffff830497faa000 0000000000000000 ffff830497faa168 
> 0000000000010000
> (XEN) [   41.226753]    ffff830497faa160 ffff830498307de0 ffff82d0405c9ea6 
> 5f24a0ddbbeda194
> (XEN) [   41.244062]    ffff830498307e10 0000000000000000 0000000000000001 
> ffff830498307e00
> (XEN) [   41.261387]    ffff830498307e24 ffff830498307e20 ffff830497faa000 
> ffff830498307ef8
> (XEN) [   41.278730]    ffff830497faa000 ffff830497f5a000 ffffc9004002ba78 
> ffff830498307e68
> (XEN) [   41.296052]    ffff82d0405cbd4f ffff82d04053fc3e ffffc9004002ba78 
> 00000000000000a0
> (XEN) [   41.313381]    00a0fb0000000001 0000000000000000 0000000000007ff0 
> ffffffffffffffff
> (XEN) [   41.330708]    000000a000000000 0000000000000000 0000000000000000 
> ffff830498307ef8
> (XEN) [   41.348026]    ffff830497f5a000 0000000000000021 0000000000000000 
> ffffc9004002ba78
> (XEN) [   41.365357]    ffff830498307ee8 ffff82d0405427db ffff8881d6961b40 
> 0000000000000001
> (XEN) [   41.382680]    000000a000000000 000000000000000d 0000000000000000 
> ffff830498307ee8
> (XEN) [   41.400003]    ffff82d0405e7bc2 ffff830497f5a000 0000000000000000 
> ffff830497f5a000
> (XEN) [   41.417343]    0000000000000000 0000000000000000 ffff830498307fff 
> 0000000000000000
> (XEN) [   41.434674]    00007cfb67cf80e7 ffff82d0402012d3 ffff8881d6961b40 
> ffff888100ef30c0
> (XEN) [   41.452010]    0000000000000001 0000000000000005 0000000000000000 
> ffff888100ef3000
> (XEN) [   41.469342]    0000000000000202 0000000000000001 0000000000007ff0 
> ffff8881d6961b40
> (XEN) [   41.486681]    0000000000000021 ffffffff8229d355 000000a000000000 
> ffffc9004002ba78
> (XEN) [   41.504015] Xen call trace:
> (XEN) [   41.521314]    [<ffff82d04059d2ed>] R create_irq+0x272/0x339

... I'd expect the function calling init_one_irq_desc() to have caused this.
In which case, yes, the assertion is certainly valid to trigger (as it's
arch_init_one_irq_desc() which sets the field to the expected value, yet
that won't happen if one of the involved allocations fails). I'll make a
patch, but this raises the question of how you're running Xen, when
seemingly small allocations like the ones involved here end up failing.
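
For reference, the -ENOMEM reading of %rax (the same value is also in %r12) can be checked with a small sketch like the one below. This is only an illustration, not part of any patch: `decode_reg32` is a hypothetical helper name, and the lookup assumes the host's errno numbering matches Xen's for this value (ENOMEM is 12 in both).

```python
import ctypes
import errno


def decode_reg32(value: int) -> int:
    """Interpret the low 32 bits of a dumped register value as a signed int."""
    return ctypes.c_int32(value & 0xFFFFFFFF).value


# Value of rax (and r12) taken from the register dump quoted above.
rax = 0x00000000FFFFFFF4

ret = decode_reg32(rax)
# Assumption: the host errno table agrees with Xen's numbering for this code.
name = errno.errorcode.get(-ret, "unknown")

print(f"rax as int: {ret} (-{name})")
```

The low 32 bits are used because the function returns a plain int, so the upper half of the 64-bit register carries no meaning here.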

Jan
