On 23.06.2022 11:01, Juergen Gross wrote:
> On 23.06.22 10:47, Jan Beulich wrote:
>> On 23.06.2022 10:06, Juergen Gross wrote:
>>> On 23.06.22 09:55, Jan Beulich wrote:
>>>> On 22.06.2022 18:06, Juergen Gross wrote:
>>>>> A Linux kernel 5.19 can be loaded as dom0 only if it has been
>>>>> built with CONFIG_AMD_MEM_ENCRYPT enabled. Otherwise the
>>>>> (relevant) last section of the built kernel has the NOLOAD
>>>>> flag set (while still being marked SHF_ALLOC).
>>>>>
>>>>> I think at least the hypervisor needs to be changed to support
>>>>> this layout. Otherwise it will put the initial page tables for
>>>>> dom0 at the same position as this last section, leading to
>>>>> early crashes.
>>>>
>>>> Isn't Xen using the bzImage header there, rather than any ELF
>>>> one? In which case it would matter how the NOLOAD section is
>>>
>>> For a PV kernel? No, I don't think so.
>>
>> Actually it's a mix (and the same for PV and PVH) - the bzImage
>> header is parsed to get at the embedded ELF header. XenoLinux was
>> what would/could be loaded as plain ELF.
>>
>>>> actually represented in that header. Can you provide a dump (or
>>>> binary representation) of both headers?
>>>
>>> Program Header:
>>>       LOAD off    0x0000000000200000 vaddr 0xffffffff81000000 paddr 0x0000000001000000 align 2**21
>>>            filesz 0x000000000145e114 memsz 0x000000000145e114 flags r-x
>>>       LOAD off    0x0000000001800000 vaddr 0xffffffff82600000 paddr 0x0000000002600000 align 2**21
>>>            filesz 0x00000000006b7000 memsz 0x00000000006b7000 flags rw-
>>>       LOAD off    0x0000000002000000 vaddr 0x0000000000000000 paddr 0x0000000002cb7000 align 2**21
>>>            filesz 0x00000000000312a8 memsz 0x00000000000312a8 flags rw-
>>>       LOAD off    0x00000000020e9000 vaddr 0xffffffff82ce9000 paddr 0x0000000002ce9000 align 2**21
>>>            filesz 0x00000000001fd000 memsz 0x0000000000317000 flags rwx
>>
>> 2ce9000 + 317000 = 3000000
>>
>>>       NOTE off    0x000000000165df10 vaddr 0xffffffff8245df10 paddr 0x000000000245df10 align 2**2
>>>            filesz 0x0000000000000204 memsz 0x0000000000000204 flags ---
>>>
>>>
>>> Sections:
>>> Idx Name          Size      VMA               LMA               File off  Algn
>>> ...
>>>    30 .smp_locks    00009000  ffffffff82edc000  0000000002edc000  022dc000  2**2
>>>                     CONTENTS, ALLOC, LOAD, READONLY, DATA
>>>    31 .data_nosave  00001000  ffffffff82ee5000  0000000002ee5000  022e5000  2**2
>>>                     CONTENTS, ALLOC, LOAD, DATA
>>>    32 .bss          0011a000  ffffffff82ee6000  0000000002ee6000  022e6000  2**12
>>>                     ALLOC
>>
>> 2ee6000 + 11a000 = 3000000
>>
>>>    33 .brk          00026000  ffffffff83000000  ffffffff83000000  00000000  2**0
>>>                     ALLOC
>>
>> This space isn't covered by any program header. Which in turn may be a
>> result of its LMA matching its VMA, unlike for all other sections.
>> Looks like a linker script or linker issue to me: While ...
>>
>>> And the related linker script part:
>>>
>>>           __end_of_kernel_reserve = .;
>>>
>>>           . = ALIGN(PAGE_SIZE);
>>>           .brk (NOLOAD) : AT(ADDR(.brk) - LOAD_OFFSET) {
>>
>> ... this AT() looks correct to me, I'm uncertain of the use of NOLOAD.
>> Note that .bss doesn't have NOLOAD, matching the vast majority of the
>> linker scripts ld itself has.
> 
> Yeah, but the filesz and memsz values of the .bss related program header
> differ a lot (basically by the .bss size plus some alignment),

That's the very nature of .bss - no data to be loaded from the file.

> and the
> .bss section flags clearly say that its attributes match those of .brk.
> 
> I'm not sure why the linker wouldn't add .brk to the same program
> header entry as .bss, but maybe that is some .bss special handling.

I don't know either, but I suspect this to be an effect of using NOLOAD
(without meaning to decide yet whether it's a wrong use of the
attribute or bad handling of it in ld).
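The mismatch can be checked numerically from the dumps quoted above. A small illustrative sketch (plain Python, addresses copied from the objdump output; the per-cpu segment with vaddr 0 is omitted) — not anyone's loader code, just the arithmetic:

```python
# Why placing data right after the last PT_LOAD segment can collide
# with an ALLOC section that no program header covers.

# (p_vaddr, p_memsz) of the virtually-mapped PT_LOAD entries quoted above
pt_load = [
    (0xffffffff81000000, 0x145e114),
    (0xffffffff82600000, 0x6b7000),
    (0xffffffff82ce9000, 0x317000),
]

# (sh_addr, sh_size) of the sections of interest
sections = {
    ".bss": (0xffffffff82ee6000, 0x11a000),
    ".brk": (0xffffffff83000000, 0x26000),  # NOLOAD, but still SHF_ALLOC
}

# A loader looking only at program headers thinks the image ends here:
phdr_end = max(vaddr + memsz for vaddr, memsz in pt_load)

brk_start, brk_size = sections[".brk"]
assert phdr_end == brk_start             # .brk begins exactly at that end ...
assert brk_start + brk_size > phdr_end   # ... and extends beyond it, uncovered
print(hex(phdr_end))  # 0xffffffff83000000
```

So anything the loader places at `phdr_end` lands on top of .brk.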

> In the end I think this might be a linker issue, but even in this case
> we should really consider handling it, as otherwise we'd just be saying
> "hey, due to a linker problem we don't support Linux 5.19 in PV mode".
> 
> In the end we can't control which linker versions are used to link
> the kernel.

Right, but the workaround for such a linker issue (if any) would better
live in Linux 5.19.
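The hypervisor-side hardening discussed above would amount to deriving the image end from section headers as well as program headers. A minimal illustrative sketch in Python (not Xen's libelf code; offsets follow the standard ELF64 layout for little-endian x86-64):

```python
import struct

PT_LOAD = 1
SHF_ALLOC = 0x2

def kernel_end(elf: bytes) -> int:
    """Highest virtual address occupied by the image, considering both
    PT_LOAD segments and SHF_ALLOC sections (so a NOLOAD .brk that no
    program header covers is still accounted for)."""
    # ELF64 header fields we need.
    e_phoff, e_shoff = struct.unpack_from("<QQ", elf, 0x20)
    e_phentsize, e_phnum, e_shentsize, e_shnum = \
        struct.unpack_from("<HHHH", elf, 0x36)
    end = 0
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        p_type, = struct.unpack_from("<I", elf, off)
        if p_type != PT_LOAD:
            continue
        p_vaddr, = struct.unpack_from("<Q", elf, off + 0x10)
        p_memsz, = struct.unpack_from("<Q", elf, off + 0x28)
        end = max(end, p_vaddr + p_memsz)
    for i in range(e_shnum):
        off = e_shoff + i * e_shentsize
        sh_flags, = struct.unpack_from("<Q", elf, off + 0x08)
        sh_addr, = struct.unpack_from("<Q", elf, off + 0x10)
        sh_size, = struct.unpack_from("<Q", elf, off + 0x20)
        if sh_flags & SHF_ALLOC and sh_addr:
            end = max(end, sh_addr + sh_size)
    return end
```

With such a scan, the initial page tables for dom0 would be placed above .brk rather than on top of it, independently of whether the linker (or linker script) gets fixed.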

Jan
