On 16.04.25 14:07, Petr Vaněk wrote:
> Hi all,
>
> I have discovered a regression introduced in commit a9b3c355c2e6
> ("asm-generic: pgalloc: provide generic __pgd_{alloc,free}") [1,2] in
> kernel version 6.14. The problem occurs when the x86 kernel is configured
> with CONFIG_DEBUG_VM_PGFLAGS=y and is run as a PV Dom0 in Xen 4.19.1.
> During startup, the kernel panics with the error log below.
>
> The commit changed the PGD allocation path. In the new implementation
> _pgd_alloc allocates memory with __pgd_alloc, which indirectly calls
>
>     alloc_pages_noprof(gfp | __GFP_COMP, order);
>
> This is in contrast to the old behavior, where __get_free_pages was used,
> which indirectly called
>
>     alloc_pages_noprof(gfp_mask & ~__GFP_HIGHMEM, order);
>
> The key difference is that the new allocator can return a compound page.
> When xen_pin_page is later called on such a page, it calls the
> TestSetPagePinned function, which internally uses the PF_NO_COMPOUND
> macro. This macro enforces VM_BUG_ON_PGFLAGS if PageCompound is true,
> triggering the panic when CONFIG_DEBUG_VM_PGFLAGS is enabled.
>
> I am reporting this issue without a patch as I am not sure which part of
> the code should be adapted to resolve the regression.
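
The failure mode in the report can be illustrated with a small user-space C sketch. This is a toy model, not kernel code: `toy_page`, `toy_alloc`, and `toy_test_set_pinned` are hypothetical names, and the rule that `__GFP_COMP` only yields a compound page for orders above 0 is an assumption of the model.

```c
#include <stdbool.h>

/* Simplified stand-in for struct page; NOT the kernel's definition. */
struct toy_page {
    bool compound; /* models PageCompound() */
    bool pinned;   /* models the PG_pinned flag */
};

/*
 * Models the allocation: in this sketch a page is compound only when
 * __GFP_COMP is requested AND the order is greater than 0 (assumption).
 */
static struct toy_page toy_alloc(bool gfp_comp, unsigned int order)
{
    struct toy_page p = { .compound = gfp_comp && order > 0, .pinned = false };
    return p;
}

/*
 * Models TestSetPagePinned under a PF_NO_COMPOUND policy.
 * Returns -1 where the kernel would hit VM_BUG_ON_PGFLAGS (only when the
 * debug check is enabled); otherwise sets the flag and returns its old value.
 */
static int toy_test_set_pinned(struct toy_page *p, bool debug_vm_pgflags)
{
    if (debug_vm_pgflags && p->compound)
        return -1; /* the kernel would panic here */

    int old = p->pinned ? 1 : 0;
    p->pinned = true;
    return old;
}
```

In this model, an order-1 allocation with `__GFP_COMP` trips the check when debugging is enabled, while the same request at order 0, or without `__GFP_COMP`, does not.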
Thanks for the report AND the analysis.

I believe PGD_ALLOCATION_ORDER needs to be changed: in case the system is
running as a Xen PV domain (or with PTI disabled), PGD_ALLOCATION_ORDER
should be 0.

So I'd suggest switching PGD_ALLOCATION_ORDER to be defined either as 0 (in
case PTI is not configured), or as pgd_allocation_order (a new global
variable having the value 0 or 1, depending on whether PTI is active or not).

I'll send a patch.

Juergen
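
A rough sketch of the shape this suggestion could take, written as stand-alone C rather than an actual kernel patch. `TOY_CONFIG_PTI` and `toy_set_pgd_order` are hypothetical stand-ins for the real config symbol and for whatever boot-time code would set the variable. Only `PGD_ALLOCATION_ORDER` and `pgd_allocation_order` come from the mail above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel config option; 0 models "PTI not configured". */
#ifndef TOY_CONFIG_PTI
#define TOY_CONFIG_PTI 1
#endif

#if TOY_CONFIG_PTI
/* Runtime order: 1 while PTI is active, 0 otherwise. */
static unsigned int pgd_allocation_order = 1;
#define PGD_ALLOCATION_ORDER pgd_allocation_order
#else
/* Without PTI support built in, a single page is always enough. */
#define PGD_ALLOCATION_ORDER 0
#endif

/*
 * Hypothetical boot-time hook: pick order 0 when PTI is off or when
 * running as a Xen PV domain, order 1 otherwise.
 */
static void toy_set_pgd_order(bool pti_active, bool xen_pv)
{
#if TOY_CONFIG_PTI
    pgd_allocation_order = (pti_active && !xen_pv) ? 1 : 0;
#else
    (void)pti_active;
    (void)xen_pv;
#endif
}
```

Callers keep using `PGD_ALLOCATION_ORDER` unchanged. With order 0 the PGD is a single page, so the pinning code would no longer be handed a multi-page allocation.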