On 07.09.21 19:30, Jan Beulich wrote:
> On 07.09.2021 15:33, Oleksandr Andrushchenko wrote:
>> On 06.09.21 17:31, Jan Beulich wrote:
>>> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote:
>>>> --- a/xen/drivers/vpci/header.c
>>>> +++ b/xen/drivers/vpci/header.c
>>>> @@ -400,12 +400,72 @@ static void bar_write(const struct pci_dev *pdev, 
>>>> unsigned int reg,
>>>>    static void guest_bar_write(const struct pci_dev *pdev, unsigned int 
>>>> reg,
>>>>                                uint32_t val, void *data)
>>>>    {
>>>> +    struct vpci_bar *bar = data;
>>>> +    bool hi = false;
>>>> +
>>>> +    if ( bar->type == VPCI_BAR_MEM64_HI )
>>>> +    {
>>>> +        ASSERT(reg > PCI_BASE_ADDRESS_0);
>>>> +        bar--;
>>>> +        hi = true;
>>>> +    }
>>>> +    else
>>>> +        val &= PCI_BASE_ADDRESS_MEM_MASK;
>>>> +    bar->guest_addr &= ~(0xffffffffull << (hi ? 32 : 0));
>>>> +    bar->guest_addr |= (uint64_t)val << (hi ? 32 : 0);
>>> What you store here is not the address that's going to be used,
>> bar->guest_addr is never used directly to be reported to a guest.
> And this is what I question, as an approach. Your model _may_ work,
> but its needlessly deviating from how people like me would expect
> this to work. And if it's intended to be this way, how would I
> have known?
Well, I just tried to follow the model we already have in the existing code...
>
>> It is always used as an initial value which is then modified to reflect
>> lower bits, e.g. BAR type and if prefetchable, so I think this is perfectly
>> fine to have it this way.
> And it is also perfectly fine to store the value to be handed
> back to guests on the next read. Keeps the read path simple,
> which I think can be assumed to be taken more frequently than
> the write one. Plus stored values reflect reality.
>
> Plus - if what you say was really the case, why do you mask off
> PCI_BASE_ADDRESS_MEM_MASK here? You should then simply store
> the written value and do _all_ the processing in the read path.
> No point having partial logic in two places.

I will move the logic to the write handler, indeed we can save some

CPU cycles here

>
>>>    as
>>> you don't mask off the low bits (to account for the BAR's size).
>>> When a BAR gets written with all ones, all writable bits get these
>>> ones stored. The address of the BAR, aiui, really changes to
>>> (typically) close below 4Gb (in the case of a 32-bit BAR), which
>>> is why memory / I/O decoding should be off while sizing BARs.
>>> Therefore you shouldn't look for the specific "all writable bits
>>> are ones" pattern (or worse, as you presently do, the "all bits
>>> outside of the type specifier are ones" one) on the read path.
>>> Instead mask the value appropriately here, and simply return back
>>> the stored value from the read path.
>> "PCI LOCAL BUS SPECIFICATION, REV. 3.0", "IMPLEMENTATION NOTE
>>
>> Sizing a 32-bit Base Address Register Example" says, that
>>
>> "Software saves the original value of the Base Address register, writes
>> 0 FFFF FFFFh to the register, then reads it back."
>>
>> The same applies for 64-bit BARs. So what's wrong if I try to catch such
>> a write when a guest tries to size the BAR? The only difference is that
>> I compare as
>>           if ( (val & PCI_BASE_ADDRESS_MEM_MASK_32) == 
>> PCI_BASE_ADDRESS_MEM_MASK_32 )
>> which is because val in the question has lower bits cleared.
> Because while this matches what the spec says, it's not enough to
> match how hardware behaves.
But we should implement as the spec says, not like buggy hardware behaves
>   Yet you want to mimic hardware behavior
> as closely as possible here. There is (iirc) at least one other
> place in the source tree were we had to adjust a similar initial
> implementation to be closer to one expected by guests,

Could you please point me to that code so I can consult and possibly

re-use the approach?

>   no matter
> that they may not be following the spec to the letter. Don't
> forget that there may be bugs in kernels which don't surface until
> the kernel gets run on an overly simplistic emulation.
This is sad. And the kernel needs to be fixed then, not Xen
>
>>>> @@ -522,6 +582,13 @@ static int add_bar_handlers(struct pci_dev *pdev, 
>>>> bool is_hwdom)
>>>>                if ( rc )
>>>>                    return rc;
>>>>            }
>>>> +        /*
>>>> +         * It is neither safe nor secure to initialize guest's view of 
>>>> the BARs
>>>> +         * with real values which are used by the hardware domain, so 
>>>> assign
>>>> +         * all zeros to guest's view of the BARs, so the guest can perform
>>>> +         * proper PCI device enumeration and assign BARs on its own.
>>>> +         */
>>>> +        bars[i].guest_addr = 0;
>>> I'm afraid I don't understand the comment: Without memory decoding
>>> enabled, the BARs are simple registers (with a few r/o bits).
>> My first implementation was that bar->guest_addr was initialized with
>> the value of bar->addr (physical BAR value), but talking on IRC with
>> Roger he suggested that this might be a security issue to let guest
>> a hint about physical values, so then I changed the assignment to be 0.
> Well, yes, that's certainly correct. It's perhaps too unobvious to me
> why one may want to use the host value here in the first place. It
> simply has no meaning here.
Do you want me to remove the comment?
>
>>>> --- a/xen/include/xen/pci_regs.h
>>>> +++ b/xen/include/xen/pci_regs.h
>>>> @@ -103,6 +103,7 @@
>>>>    #define  PCI_BASE_ADDRESS_MEM_TYPE_64   0x04    /* 64 bit address */
>>>>    #define  PCI_BASE_ADDRESS_MEM_PREFETCH  0x08    /* prefetchable? */
>>>>    #define  PCI_BASE_ADDRESS_MEM_MASK      (~0x0fUL)
>>>> +#define  PCI_BASE_ADDRESS_MEM_MASK_32     (~0x0fU)
>>> Please don't introduce an identical constant that's merely of
>>> different type. (uint32_t)PCI_BASE_ADDRESS_MEM_MASK at the use
>>> site (if actually still needed as per the comment above) would
>>> seem more clear to me.
>> Ok, I thought type casting is a bigger evil here
> Often it is, but imo here it is not. I hope you realize that ~0x0fU
> if not necessarily 0xfffffff0? We make certain assumptions on type
> widths. For unsigned int we assume it to be at least 32 bits wide.
> We should avoid assumptions of it being exactly 32 bits wide. Just
> like we cannot (more obviously) assume the width of unsigned long.
> (Which tells us that for 32-bit arches PCI_BASE_ADDRESS_MEM_MASK is
> actually of the wrong type. This constant should be the same no
> matter the bitness.)
Ok, I will not introduce a new define and use (uint32_t)
>
> Jan
>
Thank you,

Oleksandr

Reply via email to