On 12/12/2023 9:43 am, James Dingwall wrote:
> Hi,
>
> We were experiencing a crash during PV domU boot on several different models
> of hardware but all with Intel CPUs. The Xen version was based on stable-4.15
> at 4a4daf6bddbe8a741329df5cc8768f7dec664aed (XSA-444) with some local
> patches. Since updating the branch to
> b918c4cdc7ab2c1c9e9a9b54fa9d9c595913e028 (XSA-446) we have not observed the
> same crash.

That range covers:

1f5f515da0f6 - iommu/amd-vi: use correct level for quarantine domain page tables
b918c4cdc7ab - x86/spec-ctrl: Remove conditional IRQs-on-ness for INT $0x80/0x82 paths

so yeah - not much in the way of change.

> The occurrence was on 1-2% of boots and we couldn't determine a particular
> sequence of events that would trigger it. The kernel is based on Ubuntu's
> 5.15.0-91 tag but we also observed the same with -85. Due to the low
> frequency it is possible that we simply haven't observed it again since
> updating our Xen build.
>
> If I have followed the early startup this is happening shortly after detection
> of possible CPU vulnerabilities and patching in alternative instructions. As
> the RIP was native_irq_return_iret and XSA-446 related to interrupt management
> I wondered if it was possible that despite "Xen is not believed to be
> vulnerable in default configurations on CPUs from other hardware vendors."
> there could be some conditions in which an Intel CPU is affected?

In short, XSA-446 isn't plausibly related. It's completely internal to
Xen, with no alteration on guest state.

It is an error that Linux has ended up in native_irq_return_iret. Linux
cannot return to itself with an IRET instruction, and must use
HYPERCALL_iret instead.

In recent versions of Linux, this is fixed up as about the earliest
action a PV kernel takes, but on older versions of Linux, any
interrupt/exception early enough on boot was fatal in this way.

This part of the backtrace is odd:

[    0.398962]  ? native_iret+0x7/0x7
[    0.398967]  ? insn_decode+0x79/0x100
[    0.398975]  ? insn_decode+0xcf/0x100
[    0.398980]  optimize_nops+0x68/0x150

as it's not clear how we've ended up in a case wanting to return back to
the kernel to begin with. However, it's most likely a pagefault, as
optimize_nops() is making changes in arbitrary locations.

It is possible that a change in visible features has altered the
behaviour enough not to crash, but if everything is still the same as far
as you can tell, then it's likely just chance that you haven't seen it
again.

This is definitely a Linux bug, so I suspect something bad has been
backported into Ubuntu.

~Andrew
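
Illustrative note: the ordering problem described in the reply (a PV kernel must not reach the native IRET path, and an exception taken before the early fixup is fatal) can be sketched as a toy model. This is only a sketch under stated assumptions: ToyKernel, early_fixup, take_exception and boot are hypothetical names invented for illustration, and nothing here is real Linux or Xen code or reflects how either project actually implements the fixup.

```python
# Toy model only. Every name below is hypothetical; this is not kernel code.

class FatalReturn(Exception):
    """Raised when the modelled guest uses a return path it is not allowed to use."""


def native_iret(is_pv_guest):
    # Per the reply above, a PV kernel cannot return to itself with IRET.
    if is_pv_guest:
        raise FatalReturn("PV guest reached the native IRET path")
    return "returned via IRET"


def hypercall_iret(is_pv_guest):
    # Stand-in for asking Xen to perform the return (HYPERCALL_iret).
    return "returned via HYPERCALL_iret"


class ToyKernel:
    def __init__(self, is_pv_guest):
        self.is_pv_guest = is_pv_guest
        # Assumption for the model: the native path is the default until fixed up.
        self.irq_return = native_iret

    def early_fixup(self):
        # Models "fixed up as about the earliest action a PV kernel takes".
        if self.is_pv_guest:
            self.irq_return = hypercall_iret

    def take_exception(self):
        # Models returning from an interrupt/exception (e.g. a pagefault).
        return self.irq_return(self.is_pv_guest)


def boot(fixup_first):
    """Boot a modelled PV guest and take one early exception."""
    kernel = ToyKernel(is_pv_guest=True)
    if fixup_first:
        kernel.early_fixup()
    try:
        return kernel.take_exception()
    except FatalReturn as exc:
        return f"fatal: {exc}"
```

In this model, boot(fixup_first=True) survives the exception, while boot(fixup_first=False) ends in the fatal path, which mirrors why the timing of an early exception relative to the fixup matters.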
