On 16/02/18 20:02, Andrew Cooper wrote:
> On 16/02/18 18:51, Marek Marczykowski-Górecki wrote:
>> On Fri, Feb 16, 2018 at 05:52:50PM +0000, Andrew Cooper wrote:
>>> On 16/02/18 17:48, Marek Marczykowski-Górecki wrote:
>>>> As in the subject, the guest crashes on boot, before the kernel outputs
>>>> anything. I've isolated this to the conditions below:
>>>> - PV guest has a PCI device assigned (e1000e emulated by QEMU in this
>>>> case); without a PCI device it works
>>>> - Xen (in KVM) is started through OVMF; with seabios it works
>>>> - nested HVM is disabled in KVM
>>>> - AMD IOMMU emulation is disabled in KVM; when enabled qemu crashes on
>>>> boot (looks like qemu bug, unrelated to this one)
>>>> Version info:
>>>> - KVM host: OpenSUSE 42.3, qemu 2.9.1,
>>>> ovmf-2017+git1492060560.b6d11d7c46-4.1, AMD
>>>> - Xen host: Xen 4.8.3, dom0: Linux 4.14.13
>>>> - Xen domU: Linux 4.14.13, direct boot
>>>> Not sure if relevant, but initially I've tried booting xen.efi /mapbs
>>>> /noexitboot and then dom0 kernel crashed saying something about conflict
>>>> between e820 and kernel mapping. But now those options are disabled.
>>>> The crash message:
>>>> (XEN) d1v0 Unhandled invalid opcode fault/trap [#6, ec=0000]
>>>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d080218720
>>>> (XEN) Domain 1 (vcpu#0) crashed on cpu#1:
>>>> (XEN) ----[ Xen-4.8.3 x86_64 debug=n Not tainted ]----
>>>> (XEN) CPU: 1
>>>> (XEN) RIP: e033:[<ffffffff826d9156>]
>>> This is #UD, which is most probably hitting a BUG(). addr2line this ^
>>> to find some code to look at.
>> addr2line failed me
> By default, vmlinux is stripped and compressed. Ideally you want to
> addr2line the vmlinux artefact in the root of your kernel build, which
> is the plain ELF with debugging symbols.
> Alternatively, use scripts/extract-vmlinux on the binary you actually
> booted, which might get you somewhere.
>> , but System.map says it's xen_memory_setup. And it
>> looks like the BUG() is the same as I had in dom0 before:
>> "Xen hypervisor allocated kernel memory conflicts with E820 map".
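For reference, the System.map lookup mentioned above can be scripted
when addr2line is not usable. This is only a minimal sketch: it assumes
the usual "address type name" line format of System.map, and the
function names parse_system_map and resolve are made up for
illustration.

```python
import bisect


def parse_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) of the last symbol at or below addr, or None."""
    addresses = [a for a, _ in symbols]
    i = bisect.bisect_right(addresses, addr) - 1
    if i < 0:
        return None  # addr lies below every known symbol
    base, name = symbols[i]
    return name, addr - base
```

Usage would be something like
resolve(parse_system_map(open("System.map")), 0xffffffff826d9156),
using the System.map that matches the kernel actually booted. The result
only names the enclosing symbol; it does not give a source line the way
addr2line on an unstripped vmlinux would.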
> Juergen: Is there anything we can do to try and insert some dummy
> exception handlers right at PV start, so we could at least print out a
> one-liner to the host console which is a little more helpful than Xen
> saying "something unknown went wrong"?
You mean something like commit 42b3a4cb5609de757f5445fcad18945ba9239a07
added to kernel 4.15?
>> Disabling e820_host in guest config solved the problem. Thanks!
>> Is this some bug in Xen or OVMF, or is it expected behavior and e820_host
>> should be avoided?
> I don't really know. e820_host is a gross hack which shouldn't really
> be present. The actual problem is that Linux can't cope with the
> memory layout it was given (and I can't recall if there is anything
> Linux could potentially do to cope). OTOH, the toolstack, which knew
> about e820_host and chose to lay the guest out in an overlapping way is
> probably also at fault.
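To illustrate the kind of conflict being discussed, here is a simplified
sketch of an E820 overlap check. It is not the kernel's code: the
function name range_conflicts and the (start, size, type) tuple layout
are assumptions made for the example. It treats a range as conflicting
unless it lies entirely inside a single RAM entry (E820 type 1).

```python
E820_RAM = 1  # E820 type value for usable RAM


def range_conflicts(e820, start, size):
    """Return True unless [start, start + size) fits inside one RAM entry.

    e820 is an iterable of (entry_start, entry_size, entry_type) tuples.
    """
    end = start + size
    for entry_start, entry_size, entry_type in e820:
        entry_end = entry_start + entry_size
        if entry_type == E820_RAM and entry_start <= start and end <= entry_end:
            return False  # fully backed by usable RAM
    return True
```

With e820_host, the guest sees a map derived from the host's, so a range
the toolstack picked for the kernel can end up straddling a non-RAM
region, which is the situation such a check would flag.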
The kernel can cope with lots of E820 scenarios (e.g. by relocating
initrd or the p2m map), but moving itself out of the way is not
Xen-devel mailing list