On 21.05.2025 08:08, Chen, Jiqian wrote:
> On 2025/5/19 21:21, Roger Pau Monné wrote:
>> On Mon, May 19, 2025 at 03:10:17PM +0200, Jan Beulich wrote:
>>> On 19.05.2025 09:13, Chen, Jiqian wrote:
>>>> On 2025/5/19 14:56, Jan Beulich wrote:
>>>>> On 19.05.2025 08:43, Chen, Jiqian wrote:
>>>>>> On 2025/5/18 22:20, Jan Beulich wrote:
>>>>>>> On 09.05.2025 11:05, Jiqian Chen wrote:
>>>>>>>> @@ -827,6 +827,34 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
>>>>>>>>                                                   PCI_STATUS_RSVDZ_MASK);
>>>>>>>>  }
>>>>>>>>  
>>>>>>>> +static int vpci_init_ext_capability_list(struct pci_dev *pdev)
>>>>>>>> +{
>>>>>>>> +    unsigned int pos = PCI_CFG_SPACE_SIZE, ttl = 480;
>>>>>>>
>>>>>>> The ttl value exists (in the function you took it from) to make sure
>>>>>>> the loop below eventually ends. That is, to be able to kind of
>>>>>>> gracefully deal with loops in the linked list. Such loops, however,
>>>>>>> would ...
>>>>>>>
>>>>>>>> +    if ( !is_hardware_domain(pdev->domain) )
>>>>>>>> +        /* Extended capabilities read as zero, write ignore for guest */
>>>>>>>> +        return vpci_add_register(pdev->vpci, vpci_read_val, NULL,
>>>>>>>> +                                 pos, 4, (void *)0);
>>>>>>>> +
>>>>>>>> +    while ( pos >= PCI_CFG_SPACE_SIZE && ttl-- )
>>>>>>>> +    {
>>>>>>>> +        uint32_t header = pci_conf_read32(pdev->sbdf, pos);
>>>>>>>> +        int rc;
>>>>>>>> +
>>>>>>>> +        if ( !header )
>>>>>>>> +            return 0;
>>>>>>>> +
>>>>>>>> +        rc = vpci_add_register(pdev->vpci, vpci_read_val, vpci_hw_write32,
>>>>>>>> +                               pos, 4, (void *)(uintptr_t)header);
>>>>>>>
>>>>>>> ... mean we may invoke this twice for the same capability. Such
>>>>>>> a secondary invocation would fail with -EEXIST, causing device init
>>>>>>> to fail altogether. Which is kind of against our aim of exposing
>>>>>>> (in a controlled manner) as much of the PCI hardware as possible.
>>>>>> May I ask in what situation this could be invoked twice for the same
>>>>>> capability during initialization?
>>>>>> Can a hardware capability list contain a cycle?
>>>>>
>>>>> Any of this is to work around flawed hardware, I suppose.
>>>>>
>>>>>>> Imo we ought to be using a bitmap to detect the situation earlier
>>>>>>> and hence to be able to avoid redundant register addition. Thoughts?
>>>>>> Can we just carry on and add the register for the next capability
>>>>>> when rc == -EEXIST, instead of returning an error?
>>>>>
>>>>> Possible, but feels wrong.
>>>> How about, on -EEXIST, setting the next-pointer bits of the previous
>>>> extended capability to zero and returning 0? Then we break the cycle.
>>>
>>> Hmm. Again an option, yet again I'm not certain. But that's perhaps just
>>> me, and Roger may be fine with it. IOW we might as well start out this way,
>>> and adjust if (ever) an issue with a real device is found.
>>
>> Returning -EEXIST might be fine, but at that point there's no further
>> capability to process.  There's a loop in the linked capability list,
>> and we should just exit.  There needs to be a warning in this case,
>> and since this is for the hardware domain only it shouldn't be fatal.
>>
> If I understand correctly, I need to add the below in the next version?
> 
>          rc = vpci_add_register(pdev->vpci, vpci_read_val, vpci_hw_write32,
>                                 pos, 4, (void *)(uintptr_t)header);
> +
> +        if ( rc == -EEXIST )
> +        {
> +            printk(XENLOG_WARNING
> +                   "%pd %pp: there is a loop in the linked capability list\n",

I think we shouldn't say "loop" unless we firmly know that's what the
issue is. Maybe use "overlap" instead? And then also log the offending
register range? (As a nit: "there is" and "linked" are not adding any
value to the log message; to keep them short [without losing
information], please try to avoid such.)
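
To illustrate the bitmap idea from earlier in the thread together with the
shorter message, here is a rough standalone sketch. It is not Xen code:
fake_read32() is a hypothetical stand-in for pci_conf_read32(), printf()
stands in for printk(), walk_ext_caps() is an invented name, and the
message wording is only one possibility. It assumes the caller passes a
full 4096-byte config space image.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PCI_CFG_SPACE_SIZE      256u
#define PCI_CFG_SPACE_EXP_SIZE  4096u
/* Next-capability offset is in bits 31:20; its low two bits are reserved. */
#define PCI_EXT_CAP_NEXT(h)     (((h) >> 20) & 0xffcu)

#define NR_DWORDS               (PCI_CFG_SPACE_EXP_SIZE / 4)

/* Hypothetical stand-in for pci_conf_read32(): reads a fake config space. */
static uint32_t fake_read32(const uint32_t *cfg, unsigned int pos)
{
    return cfg[pos / 4];
}

/*
 * Walk the extended capability list, tracking visited header dwords in a
 * bitmap. Returns the number of distinct headers visited. *overlap_pos is
 * set to the offset of the first header reached a second time, or to 0 if
 * the list terminates normally.
 */
static unsigned int walk_ext_caps(const uint32_t *cfg,
                                  unsigned int *overlap_pos)
{
    uint8_t seen[NR_DWORDS / 8];
    unsigned int pos = PCI_CFG_SPACE_SIZE, count = 0;

    assert(cfg && overlap_pos);
    memset(seen, 0, sizeof(seen));
    *overlap_pos = 0;

    /* No ttl needed: every iteration either stops or sets a new bit. */
    while ( pos >= PCI_CFG_SPACE_SIZE )
    {
        uint32_t header = fake_read32(cfg, pos);
        unsigned int idx = pos / 4;

        if ( !header )
            break;

        if ( seen[idx / 8] & (1u << (idx % 8)) )
        {
            /* Short message, including the offending register range. */
            printf("overlap in extended capability list: [%#x, %#x]\n",
                   pos, pos + 3);
            *overlap_pos = pos;
            break;
        }

        seen[idx / 8] |= 1u << (idx % 8);
        count++;
        pos = PCI_EXT_CAP_NEXT(header);
    }

    return count;
}
```

With the check done ahead of any register addition, a revisited header is
reported and the walk simply stops, so nothing is added twice and there is
no -EEXIST to handle for this case.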

Jan
