On 03.12.25 23:10, Julien Grall wrote:
> Hi,

Hello Julien

> 
> On 03/12/2025 18:58, Oleksandr Tyshchenko wrote:
>> Creating a dom0less guest with a high vCPU count (e.g., >32) fails
>> because the fixed 4KiB device tree buffer (DOMU_DTB_SIZE) overflows
>> during creation.
>>
>> The FDT nodes for each vCPU are the primary consumer of space,
>> and the previous fixed-size buffer was insufficient.
>>
>> This patch replaces the fixed size with a formula that calculates
>> the required buffer size based on a fixed baseline plus a scalable
>> portion for each potential vCPU up to the MAX_VIRT_CPUS limit.
>>
>> Signed-off-by: Oleksandr Tyshchenko <[email protected]>
>> ---
>> V1: https://patchew.org/Xen/20251202193246.3357821-1-oleksandr._5Ftyshchenko@epam.com/
>>
>>    V2:
>>     - update commit subj/desc
>>     - use a formula that accounts MAX_VIRT_CPUS
>>     - add BUILD_BUG_ON
>> ---
>> ---
>>   xen/common/device-tree/dom0less-build.c | 16 +++++++++++++---
>>   1 file changed, 13 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/common/device-tree/dom0less-build.c b/xen/common/ 
>> device-tree/dom0less-build.c
>> index 3f5b987ed8..38a5830813 100644
>> --- a/xen/common/device-tree/dom0less-build.c
>> +++ b/xen/common/device-tree/dom0less-build.c
>> @@ -461,15 +461,25 @@ static int __init 
>> domain_handle_dtb_boot_module(struct domain *d,
>>   /*
>>    * The max size for DT is 2MB. However, the generated DT is small 
>> (not including
>> - * domU passthrough DT nodes whose size we account separately), 4KB 
>> are enough
>> - * for now, but we might have to increase it in the future.
>> + * domU passthrough DT nodes whose size we account separately). The 
>> size is
>> + * calculated from a fixed baseline plus a scalable portion for each 
>> potential
>> + * vCPU node up to the system limit (MAX_VIRT_CPUS), as the vCPU 
>> nodes are
>> + * the primary consumer of space.
>> + *
>> + * The baseline of 2KiB is a safe buffer for all non-vCPU FDT content.
> 
> What if the user decides to pass a DTB fragment? How do we know this will 
> fit in the 2KiB?

If a partial device tree is provided, its size is accounted for 
separately. The code that does this is not visible in the diff context, 
so I think we are good here:

     /* Account for domU passthrough DT size */
     if ( kinfo->dtb )
         fdt_size += kinfo->dtb->size;
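
To make the resulting budget explicit, here is a minimal standalone 
sketch of the arithmetic, not the actual Xen code. The helper 
domu_fdt_budget() and the two macro names are hypothetical; only the 
constants 2048 and 128 and the "add the passthrough DTB size on top" 
step come from the patch and the snippet above.

```c
#include <stddef.h>

/* Constants taken from the proposed DOMU_DTB_SIZE formula. */
#define DOMU_DTB_BASE     2048u /* baseline for all non-vCPU content */
#define DOMU_DTB_PER_VCPU 128u  /* worst-case allowance per vCPU node */

/*
 * Hypothetical helper: total FDT buffer size for a domU.
 * max_virt_cpus stands in for MAX_VIRT_CPUS; passthrough_dtb_size
 * stands in for kinfo->dtb->size (0 when no partial DT is given).
 */
static size_t domu_fdt_budget(unsigned int max_virt_cpus,
                              size_t passthrough_dtb_size)
{
    return DOMU_DTB_BASE
           + (size_t)max_virt_cpus * DOMU_DTB_PER_VCPU
           + passthrough_dtb_size;
}
```

With max_virt_cpus = 8 and no partial device tree this gives 3072 bytes; 
a 1024-byte partial device tree raises it to 4096 bytes, so the fragment 
never has to fit into the 2KiB baseline.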


> 
>> + * Empirical testing with the maximum number of other device tree 
>> nodes shows
>> + * a final compacted base size of ~1.5KiB. The 128 bytes per vCPU is 
>> derived
>> + * from a worst-case analysis of the FDT construction-time size for a 
>> single
>> + * vCPU node.
> 
> For in-code documentation, this is ok to just provide some numbers. But 
> this needs a bit more details in the commit message with the exact tests 
> you did. This would be helpful if we ever need to change the size (for 
> instance we could have extra emulated devices or we need another 
> property per CPU).

OK, I will add my testing details to the commit description.

> 
>>    */
>> -#define DOMU_DTB_SIZE 4096
>> +#define DOMU_DTB_SIZE (2048 + (MAX_VIRT_CPUS * 128))
> 
> On Arm32, MAX_VIRT_CPUS is 8. This means the new DOMU_DTB_SIZE is going 
> to be smaller than 4096. Why is it ok?

You are right to question the impact on Arm32, where MAX_VIRT_CPUS is 
smaller: the calculated buffer size becomes 3072 bytes 
(2048 + 8 * 128), which is less than the original 4096 bytes.

Unfortunately, I am not able to test on Arm32. However, looking at the 
code, I do not see much difference between Arm64 and Arm32 in how the 
DomU device tree is generated.

To check that the new size remains sufficient, I simulated this 
configuration on my Arm64 setup: I temporarily replaced MAX_VIRT_CPUS 
with 8 in the DOMU_DTB_SIZE formula and ran tests with 1 and 8 vCPUs.


diff --git a/xen/common/device-tree/dom0less-build.c 
b/xen/common/device-tree/dom0less-build.c
index 38a5830813..0c64b9dfb7 100644
--- a/xen/common/device-tree/dom0less-build.c
+++ b/xen/common/device-tree/dom0less-build.c
@@ -472,7 +472,7 @@ static int __init 
domain_handle_dtb_boot_module(struct domain *d,
   * from a worst-case analysis of the FDT construction-time size for a 
single
   * vCPU node.
   */
-#define DOMU_DTB_SIZE (2048 + (MAX_VIRT_CPUS * 128))
+#define DOMU_DTB_SIZE (2048 + (8 * 128))
  static int __init prepare_dtb_domU(struct domain *d, struct 
kernel_info *kinfo)
  {
      int addrcells, sizecells;
@@ -577,6 +577,9 @@ static int __init prepare_dtb_domU(struct domain *d, 
struct kernel_info *kinfo)
      if ( ret < 0 )
          goto err;

+    printk("Final compacted FDT size is: %d bytes\n", 
fdt_totalsize(kinfo->fdt));
+    printk("Predefined FDT size is: %d bytes\n", DOMU_DTB_SIZE);
+
      return 0;

    err:



cpus=1
(XEN) Final compacted FDT size is: 1586 bytes
(XEN) Predefined FDT size is: 3072 bytes

cpus=8
(XEN) Final compacted FDT size is: 2370 bytes
(XEN) Predefined FDT size is: 3072 bytes
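
As a rough cross-check of these numbers, the per-vCPU growth and the 
implied base size can be derived from the two measurements. This is only 
a sketch with hypothetical helper names, and it works on the final 
compacted sizes printed above, not on the construction-time peak that 
the 128-byte figure is meant to cover.

```c
#include <stdint.h>

/* Hypothetical helper: average FDT growth per additional vCPU. */
static uint32_t per_vcpu_bytes(uint32_t size_a, uint32_t cpus_a,
                               uint32_t size_b, uint32_t cpus_b)
{
    return (size_b - size_a) / (cpus_b - cpus_a);
}

/* Hypothetical helper: size left once all vCPU nodes are subtracted. */
static uint32_t base_bytes(uint32_t size, uint32_t cpus, uint32_t per_vcpu)
{
    return size - cpus * per_vcpu;
}

/* Hypothetical helper: unused space in the predefined buffer. */
static uint32_t headroom_bytes(uint32_t budget, uint32_t measured)
{
    return budget - measured;
}
```

Plugging in the logs: per_vcpu_bytes(1586, 1, 2370, 8) is 112, 
base_bytes(1586, 1, 112) is 1474 (the "~1.5KiB" base), and 
headroom_bytes(3072, 2370) is 702 bytes left with 8 vCPUs.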

Also, if I understand the code correctly, the enable-method = "psci" 
property is not added to the generated device tree on Arm32, so an 
Arm32 vCPU node would require less space:

     if ( is_64bit_domain(d) )
     {
         res = fdt_property_string(fdt, "enable-method", "psci");
         if ( res )
             return res;
     }
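
To estimate how much that property costs per vCPU node, the flattened 
device tree layout can be used: each property in the structure block is 
a 4-byte FDT_PROP token, a 4-byte length, a 4-byte name offset, and the 
value padded to 4 bytes. The helpers below are hypothetical and only 
illustrate that arithmetic; the property name itself lives in the 
strings block and is not counted here.

```c
#include <stdint.h>

/* Round up to the 4-byte alignment used in the FDT structure block. */
static uint32_t fdt_align4(uint32_t n)
{
    return (n + 3u) & ~3u;
}

/*
 * Hypothetical helper: structure-block bytes for one property.
 * 12 = FDT_PROP token (4) + len (4) + nameoff (4).
 * value_len includes the trailing NUL for string properties.
 */
static uint32_t fdt_prop_struct_bytes(uint32_t value_len)
{
    return 12u + fdt_align4(value_len);
}
```

For "psci" the value is 5 bytes including the NUL, so 
fdt_prop_struct_bytes(5) is 20 bytes that a 32-bit domain's vCPU node 
does not need.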



> 
>>   static int __init prepare_dtb_domU(struct domain *d, struct 
>> kernel_info *kinfo)
>>   {
>>       int addrcells, sizecells;
>>       int ret, fdt_size = DOMU_DTB_SIZE;
>> +    BUILD_BUG_ON(DOMU_DTB_SIZE > SZ_2M);
>> +
>>       kinfo->phandle_intc = GUEST_PHANDLE_GIC;
>>   #ifdef CONFIG_GRANT_TABLE
> 
> Cheers,
> 
