Hi,
On 14/09/2023 03:17, Leo Yan wrote:
> On ADLink AVA platform (Ampere Altra SoC with 32 Arm Neoverse N1 cores),
> the physical memory regions are:
> DRAM memory regions:
> Node[0] Region[0]: 0x000080000000 - 0x0000ffffffff
> Node[0] Region[1]: 0x080000000000 - 0x08007fffffff
> Node[0] Region[2]: 0x080100000000 - 0x0807ffffffff
> The UEFI loads Xen hypervisor and DTB into the high memory, the kernel
> and ramdisk images are loaded into the low memory space:
> (XEN) MODULE[0]: 00000807f6df0000 - 00000807f6f3e000 Xen
> (XEN) MODULE[1]: 00000807f8054000 - 00000807f8056000 Device Tree
> (XEN) MODULE[2]: 00000000fa834000 - 00000000fc5de1d5 Ramdisk
> (XEN) MODULE[3]: 00000000fc5df000 - 00000000ffb3f810 Kernel
> In this case, the Xen binary is loaded above 8TB, which exceeds the
> maximum supported identity map space of 2TB in Xen. Consequently, the
> system fails to boot.
> This patch enlarges identity map space to 127TB, allowing module loading
> within the range of [0x0 .. 0x00007eff_ffff_ffff].
On v2 you wrote:
"
When I reviewed the existed code, I found it reserves 125TiB:
0x0000028000000000 - 0x00007fffffffffff (125TB, L0 slots [5..255])
Unused
Seems to me, we can map this area. Ideally, if we only map for the
first level's page table, we can just fill the zeroeth page and don't
need to allocate extra page tables.
"
I agree that we will not need to allocate page-tables for the whole
reserved region. However, my concern is that it would be more difficult
to reclaim virtual address space if we need it in the future.
So I would prefer not to use the whole 127 TiB unless it is necessary.
For your platform, it seems that it would be enough to bump the area to
10 TiB (i.e. 8 TiB + some margin).
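
As a rough sanity check of the 10 TiB figure, the sketch below (plain
Python, not Xen code) takes the module ranges from the boot log and checks
them against a 2 TiB and a 10 TiB limit. It assumes a 4KB granule, so that
one L0 slot covers 512 GiB, which is consistent with the reserved area
starting at L0 slot 5. The helper names are made up for illustration.

```python
L0_SHIFT = 39          # assumption: 4KB granule, each L0 slot covers 512 GiB
TIB = 1 << 40

# (start, end) of each boot module, copied from the boot log.
MODULES = {
    "Xen":         (0x0807F6DF0000, 0x0807F6F3E000),
    "Device Tree": (0x0807F8054000, 0x0807F8056000),
    "Ramdisk":     (0x0000FA834000, 0x0000FC5DE1D5),
    "Kernel":      (0x0000FC5DF000, 0x0000FFB3F810),
}


def l0_slot(addr):
    """Index of the L0 (zeroeth-level) slot that maps addr."""
    return addr >> L0_SHIFT


def all_fit(limit):
    """True if every module ends at or below the identity map limit."""
    return all(end <= limit for _, end in MODULES.values())


def slots_needed(limit):
    """Number of L0 slots needed to identity map [0, limit)."""
    return (limit + (1 << L0_SHIFT) - 1) >> L0_SHIFT


for name, (start, end) in MODULES.items():
    print(f"{name}: L0 slot {l0_slot(start)}")
print("fits in  2 TiB:", all_fit(2 * TIB))
print("fits in 10 TiB:", all_fit(10 * TIB))
print("L0 slots for 10 TiB:", slots_needed(10 * TIB))
```

Under these assumptions Xen and the DTB land in L0 slot 16, so a 10 TiB
limit (20 slots) covers them while leaving the rest of the reserved area
untouched.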
> Note, despite this expansion of the identity map to 127TB, the frame
> table still only supports 2TB. The reason is the frame table is data
> structure for the page management, which does not require coverage of
> the memory layout gaps (refer to pfn_pdx_hole_setup() for Xen removing
> the biggest gap from memory regions).
This is not quite correct. PDX compression can only remove the bottom
bits (if they are all zeroes) and one contiguous range of bits in the
address. So some holes may still be covered by the frametable.
It might be possible that for your platform, the compression is enough
to fit everything in 2TB.
But I would drop this paragraph. The decision to enlarge the identity
mapping is independent of the size of the frametable. You may have a
platform where the first RAM bank is high in memory (such as on AMD
Seattle). There might also be some changes necessary in Xen to support
a frametable covering more than 2TB.
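
To illustrate the single-hole limitation, here is a simplified model in
Python. It is not the Xen implementation: it ignores the bottom-bits
compression, assumes 4KB pages, and all function names are hypothetical.
It collects the PFN bits that are in use across the DRAM regions listed in
the commit message, finds the longest run of unused bits, and squeezes
only that run out of the PFN.

```python
PAGE_SHIFT = 12  # assumption: 4KB pages

# DRAM regions from the commit message, as inclusive byte addresses.
REGIONS = [
    (0x000080000000, 0x0000FFFFFFFF),
    (0x080000000000, 0x08007FFFFFFF),
    (0x080100000000, 0x0807FFFFFFFF),
]


def used_bits(regions):
    """Mask of PFN bits that are set or vary within any region."""
    used = 0
    for start, end in regions:
        s, e = start >> PAGE_SHIFT, end >> PAGE_SHIFT
        varying = (1 << (s ^ e).bit_length()) - 1
        used |= s | varying
    return used


def biggest_hole(used):
    """(shift, width) of the longest run of zero bits below the top set bit."""
    best = (0, 0)
    bit, top = 0, used.bit_length()
    while bit < top:
        if (used >> bit) & 1:
            bit += 1
            continue
        start = bit
        while bit < top and not (used >> bit) & 1:
            bit += 1
        if bit - start > best[1]:
            best = (start, bit - start)
    return best


def pfn_to_pdx(pfn, shift, width):
    """Drop `width` bits starting at bit `shift` from the PFN."""
    bottom = pfn & ((1 << shift) - 1)
    top = pfn >> (shift + width)
    return (top << shift) | bottom


shift, width = biggest_hole(used_bits(REGIONS))
last_pfn = REGIONS[-1][1] >> PAGE_SHIFT
print(f"hole: {width} bits at bit {shift}")
print(f"highest pdx: {pfn_to_pdx(last_pfn, shift, width):#x}")
```

In this model only one run of bits is removed, so any second gap between
banks would still take up index space. Whether the real frametable fits
depends on the actual PDX code and on the size of each frametable entry,
neither of which is modelled here.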
Cheers,
--
Julien Grall