On 2025/5/19 21:21, Roger Pau Monné wrote:
> On Mon, May 19, 2025 at 03:10:17PM +0200, Jan Beulich wrote:
>> On 19.05.2025 09:13, Chen, Jiqian wrote:
>>> On 2025/5/19 14:56, Jan Beulich wrote:
>>>> On 19.05.2025 08:43, Chen, Jiqian wrote:
>>>>> On 2025/5/18 22:20, Jan Beulich wrote:
>>>>>> On 09.05.2025 11:05, Jiqian Chen wrote:
>>>>>>> @@ -827,6 +827,34 @@ static int vpci_init_capability_list(struct pci_dev *pdev)
>>>>>>>                                                   PCI_STATUS_RSVDZ_MASK);
>>>>>>>  }
>>>>>>>  
>>>>>>> +static int vpci_init_ext_capability_list(struct pci_dev *pdev)
>>>>>>> +{
>>>>>>> +    unsigned int pos = PCI_CFG_SPACE_SIZE, ttl = 480;
>>>>>>
>>>>>> The ttl value exists (in the function you took it from) to make sure
>>>>>> the loop below eventually ends. That is, to be able to kind of
>>>>>> gracefully deal with loops in the linked list. Such loops, however,
>>>>>> would ...
>>>>>>
>>>>>>> +    if ( !is_hardware_domain(pdev->domain) )
>>>>>>> +        /* Extended capabilities read as zero, write ignore for guest */
>>>>>>> +        return vpci_add_register(pdev->vpci, vpci_read_val, NULL,
>>>>>>> +                                 pos, 4, (void *)0);
>>>>>>> +
>>>>>>> +    while ( pos >= PCI_CFG_SPACE_SIZE && ttl-- )
>>>>>>> +    {
>>>>>>> +        uint32_t header = pci_conf_read32(pdev->sbdf, pos);
>>>>>>> +        int rc;
>>>>>>> +
>>>>>>> +        if ( !header )
>>>>>>> +            return 0;
>>>>>>> +
>>>>>>> +        rc = vpci_add_register(pdev->vpci, vpci_read_val, vpci_hw_write32,
>>>>>>> +                               pos, 4, (void *)(uintptr_t)header);
>>>>>>
>>>>>> ... mean we may invoke this twice for the same capability. Such
>>>>>> a secondary invocation would fail with -EEXIST, causing device init
>>>>>> to fail altogether. Which is kind of against our aim of exposing
>>>>>> (in a controlled manner) as much of the PCI hardware as possible.
>>>>> May I ask in what situation this could be invoked twice for one
>>>>> capability during initialization?
>>>>> Can a hardware capability list contain a cycle?
>>>>
>>>> Any of this is to work around flawed hardware, I suppose.
>>>>
>>>>>> Imo we ought to be using a bitmap to detect the situation earlier
>>>>>> and hence to be able to avoid redundant register addition. Thoughts?
>>>>> Can we just carry on and keep adding registers for the following
>>>>> capabilities when rc == -EEXIST, instead of returning an error?
>>>>
>>>> Possible, but feels wrong.
>>> How about, on -EEXIST, setting the next-pointer bits of the previous
>>> extended capability to zero and returning 0? That would break the cycle.
>>
>> Hmm. Again an option, yet again I'm not certain. But that's perhaps just
>> me, and Roger may be fine with it. IOW we might as well start out this way,
>> and adjust if (ever) an issue with a real device is found.
> 
> Returning -EEXIST might be fine, but at that point there's no further
> capability to process.  There's a loop in the linked capability list,
> and we should just exit.  There needs to be a warning in this case,
> and since this is for the hardware domain only it shouldn't be fatal.
> 
If I understand correctly, I should add the following in the next version?

         rc = vpci_add_register(pdev->vpci, vpci_read_val, vpci_hw_write32,
                                pos, 4, (void *)(uintptr_t)header);
+
+        if ( rc == -EEXIST )
+        {
+            printk(XENLOG_WARNING
+                   "%pd %pp: there is a loop in the linked capability list\n",
+                   pdev->domain, &pdev->sbdf);
+            return 0;
+        }
+
         if ( rc )
             return rc;
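
For comparison, the bitmap idea Jan mentioned earlier could look roughly
like the standalone sketch below. This is not Xen code: cfg[],
conf_read32(), mk_hdr() and walk_ext_caps() are names made up for
illustration, and config space is modelled as a flat array. It only shows
how recording visited offsets detects a loop before any register would be
added a second time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CFG_SPACE_SIZE      0x100u   /* first extended capability offset */
#define CFG_SPACE_EXP_SIZE  0x1000u  /* size of extended config space */
#define NR_DWORDS           (CFG_SPACE_EXP_SIZE / 4)
/* Next-capability offset: header bits 31:20, dword aligned. */
#define EXT_CAP_NEXT(h)     (((h) >> 20) & 0xffcu)

/* Hypothetical stand-in for the device's config space. */
static uint32_t cfg[NR_DWORDS];

static uint32_t conf_read32(unsigned int pos)
{
    return cfg[pos / 4];
}

/* Hypothetical helper: build a header from capability ID and next offset. */
static uint32_t mk_hdr(unsigned int id, unsigned int next)
{
    return id | (1u << 16) | ((uint32_t)next << 20);
}

/*
 * Walk the list, marking each visited dword in a bitmap.  Returns the
 * number of distinct headers seen and sets *loop if an offset repeats.
 */
static unsigned int walk_ext_caps(bool *loop)
{
    uint8_t seen[NR_DWORDS / 8] = { 0 };
    unsigned int pos = CFG_SPACE_SIZE, count = 0;

    *loop = false;
    while ( pos >= CFG_SPACE_SIZE )
    {
        unsigned int idx = pos / 4;
        uint32_t header;

        assert(idx < NR_DWORDS);
        if ( seen[idx / 8] & (1u << (idx % 8)) )
        {
            /* Already visited: the list loops back on itself. */
            *loop = true;
            break;
        }
        seen[idx / 8] |= 1u << (idx % 8);

        header = conf_read32(pos);
        if ( !header )
            break;

        count++;
        pos = EXT_CAP_NEXT(header);
    }

    return count;
}
```

With the bitmap, no separate ttl is needed, because every offset can be
visited at most once. The -EEXIST check above gets the same result without
extra state, as long as vpci_add_register() rejects duplicate offsets.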

> If it was for domUs we would possibly need to discuss whether
> assigning the device should fail if a capability linked list loop is
> found.
> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.
