On 04.03.2024 11:02, Roger Pau Monné wrote:
> On Mon, Mar 04, 2024 at 08:32:22AM +0100, Jan Beulich wrote:
>> BARs of size 2Gb and up can't possibly fit below 4Gb: Both the bottom of
>> the lower 2Gb range and the top of the higher 2Gb range have special
>> purpose. Don't even have them influence whether to (perhaps) relocate
>> low RAM.
> 
> Here you mention 2Gb BARs, yet the code below sets the maximum BAR
> size supported below 4Gb to 1Gb.

Hmm, I'm puzzled: There are no other BAR sizes between 1Gb and 2Gb.
Anything up to 1Gb continues to work as is, while everything 2Gb and
up has behavior changed.

>> --- a/tools/firmware/hvmloader/pci.c
>> +++ b/tools/firmware/hvmloader/pci.c
>> @@ -33,6 +33,13 @@ uint32_t pci_mem_start = HVM_BELOW_4G_MM
>>  const uint32_t pci_mem_end = RESERVED_MEMBASE;
>>  uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
>>  
>> +/*
>> + * BARs larger than this value are put in 64-bit space unconditionally.  
>> That
>> + * is, such BARs also don't play into the determination of how big the 
>> lowmem
>> + * MMIO hole needs to be.
>> + */
>> +#define HUGE_BAR_THRESH GB(1)
> 
> I would be fine with defining this to an even lower number, like
> 256Mb, as to avoid as much as possible memory relocation in order to
> make the MMIO hole bigger.

As suggested in a post-commit-message remark, the main question then is
how to justify this.

>> @@ -367,7 +376,7 @@ void pci_setup(void)
>>              pci_mem_start = hvm_info->low_mem_pgend << PAGE_SHIFT;
>>      }
>>  
>> -    if ( mmio_total > (pci_mem_end - pci_mem_start) )
>> +    if ( mmio_total > (pci_mem_end - pci_mem_start) || bar64_relocate )
>>      {
>>          printf("Low MMIO hole not large enough for all devices,"
>>                 " relocating some BARs to 64-bit\n");
> 
> Is the above message now accurate?  Given the current code the low
> MMIO could be expanded up to 2Gb, yet BAR relocation will happen
> unconditionally once a 1Gb BAR is found.

Well, "all" may not be quite accurate anymore, yet would making it e.g.
"all applicable" really much more meaningful?

>> @@ -446,8 +455,9 @@ void pci_setup(void)
>>           *   the code here assumes it to be.)
>>           * Should either of those two conditions change, this code will 
>> break.
>>           */
>> -        using_64bar = bars[i].is_64bar && bar64_relocate
>> -            && (mmio_total > (mem_resource.max - mem_resource.base));
>> +        using_64bar = bars[i].is_64bar && bar64_relocate &&
>> +            (mmio_total > (mem_resource.max - mem_resource.base) ||
>> +             bar_sz > HUGE_BAR_THRESH);
>>          bar_data = pci_readl(devfn, bar_reg);
>>  
>>          if ( (bar_data & PCI_BASE_ADDRESS_SPACE) ==
>> @@ -467,7 +477,8 @@ void pci_setup(void)
>>                  resource = &mem_resource;
>>                  bar_data &= ~PCI_BASE_ADDRESS_MEM_MASK;
>>              }
>> -            mmio_total -= bar_sz;
>> +            if ( bar_sz <= HUGE_BAR_THRESH )
>> +                mmio_total -= bar_sz;
> 
> I'm missing the part where hvmloader notifies QEMU of the possibly
> expanded base and size memory PCI MMIO regions, so that those are
> reflected in the PCI root complex registers?

I don't understand this comment: I'm not changing the interaction
with qemu at all. Whatever the new calculation it'll be communicated
to qemu just as before.

> Overall I think we could simplify the code by having a hardcoded 1Gb
> PCI MMIO hole below 4Gb, fill it with all the 32bit BARs and
> (re)locate all 64bit BARs above 4Gb (not that I'm requesting you to do
> it here).

I'm afraid that would not work very well with OSes which aren't 64-bit
BAR / PA aware (first and foremost non-PAE 32-bit ones). Iirc that's
the reason why it wasn't done like you suggest back at the time.

Jan

Reply via email to