On 14.02.22 13:25, Roger Pau Monné wrote:
> On Mon, Feb 14, 2022 at 11:15:27AM +0000, Oleksandr Andrushchenko wrote:
>>
>> On 14.02.22 13:11, Roger Pau Monné wrote:
>>> On Mon, Feb 14, 2022 at 10:53:43AM +0000, Oleksandr Andrushchenko wrote:
>>>> On 14.02.22 12:34, Roger Pau Monné wrote:
>>>>> On Mon, Feb 14, 2022 at 09:36:39AM +0000, Oleksandr Andrushchenko wrote:
>>>>>> On 11.02.22 13:40, Roger Pau Monné wrote:
>>>>>>> +
>>>>>>>>>>           for ( i = 0; i < msix->max_entries; i++ )
>>>>>>>>>>           {
>>>>>>>>>>               const struct vpci_msix_entry *entry = 
>>>>>>>>>> &msix->entries[i];
>>>>>>>>> Since this function is now called with the per-domain rwlock read
>>>>>>>>> locked it's likely not appropriate to call process_pending_softirqs
>>>>>>>>> while holding such lock (check below).
>>>>>>>> You are right, as it is possible that:
>>>>>>>>
>>>>>>>> process_pending_softirqs -> vpci_process_pending -> read_lock
>>>>>>>>
>>>>>>>> Even more, vpci_process_pending may also
>>>>>>>>
>>>>>>>> read_unlock -> vpci_remove_device -> write_lock
>>>>>>>>
>>>>>>>> in its error path. So, any invocation of process_pending_softirqs
>>>>>>>> must not hold d->vpci_rwlock at least.
>>>>>>>>
>>>>>>>> And also we need to check that pdev->vpci was not removed
>>>>>>>> in between or *re-created*
>>>>>>>>> We will likely need to re-iterate over the list of pdevs assigned to
>>>>>>>>> the domain and assert that the pdev is still assigned to the same
>>>>>>>>> domain.
>>>>>>>> So, do you mean a pattern like the below should be used at all
>>>>>>>> places where we need to call process_pending_softirqs?
>>>>>>>>
>>>>>>>> read_unlock
>>>>>>>> process_pending_softirqs
>>>>>>>> read_lock
>>>>>>>> pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn);
>>>>>>>> if ( pdev && pdev->vpci && is_the_same_vpci(pdev->vpci) )
>>>>>>>> <continue processing>
>>>>>>> Something along those lines. You likely need to continue iterate using
>>>>>>> for_each_pdev.
>>>>>> How do we tell if pdev->vpci is the same? Jan has already brought
>>>>>> this question before [1] and I was about to use some ID for that purpose:
>>>>>> pdev->vpci->id = d->vpci_id++ and then we use pdev->vpci->id  for checks
>>>>> Given this is a debug message I would be OK with just doing the
>>>>> minimal checks to prevent Xen from crashing (ie: pdev->vpci exists)
>>>>> and that the resume MSI entry is not past the current limit. Otherwise
>>>>> just print a message and move on to the next device.
>>>> Agree, I see no big issue (probably) if we are not able to print
>>>>
>>>> How about this one:
>>>>
>>>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>>>> index 809a6b4773e1..50373f04da82 100644
>>>> --- a/xen/drivers/vpci/header.c
>>>> +++ b/xen/drivers/vpci/header.c
>>>> @@ -171,10 +171,31 @@ static int __init apply_map(struct domain *d, const 
>>>> struct pci_dev *pdev,
>>>>                                 struct rangeset *mem, uint16_t cmd)
>>>>     {
>>>>         struct map_data data = { .d = d, .map = true };
>>>> +    pci_sbdf_t sbdf = pdev->sbdf;
>>>>         int rc;
>>>>
>>>> + ASSERT(rw_is_write_locked(&pdev->domain->vpci_rwlock));
>>>> +
>>>>         while ( (rc = rangeset_consume_ranges(mem, map_range, &data)) == 
>>>> -ERESTART )
>>>> +    {
>>>> +
>>>> +        /*
>>>> +         * process_pending_softirqs may trigger vpci_process_pending which
>>>> +         * may need to acquire pdev->domain->vpci_rwlock in read mode.
>>>> +         */
>>>> +        write_unlock(&pdev->domain->vpci_rwlock);
>>>>             process_pending_softirqs();
>>>> +        write_lock(&pdev->domain->vpci_rwlock);
>>>> +
>>>> +        /* Check if pdev still exists and vPCI was not removed or 
>>>> re-created. */
>>>> +        if (pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn) != 
>>>> pdev)
>>>> +            if ( vpci is NOT the same )
>>>> +            {
>>>> +                rc = 0;
>>>> +                break;
>>>> +            }
>>>> +    }
>>>> +
>>>>         rangeset_destroy(mem);
>>>>         if ( !rc )
>>>>             modify_decoding(pdev, cmd, false);
>>>>
>>>> This one also wants process_pending_softirqs to run so it *might*
>>>> want pdev and vpci checks. But at the same time apply_map runs
>>>> at ( system_state < SYS_STATE_active ), so defer_map won't be
>>>> running yet, thus no vpci_process_pending is possible yet (in terms
>>>> it has something to do yet). So, I think we just need:
>>>>
>>>>            write_unlock(&pdev->domain->vpci_rwlock);
>>>>            process_pending_softirqs();
>>>>            write_lock(&pdev->domain->vpci_rwlock);
>>>>
>>>> and this should be enough
>>> Given the context apply_map is called from (dom0 specific init code),
>>> there's no need to check for the pdev to still exits, or whether vpci
>>> has been recreated, as it's not possible. Just add a comment to
>>> explicitly note that the context of the function is special, and thus
>>> there's no possibility of either the device or vpci going away.
>> Does it really need write_unlock/write_lock given the context?...
> I think it's bad practice to call process_pending_softirqs while
> holding any locks. This is a very specific context so it's likely fine
> to not drop the lock, but would still seem incorrect to me.
Ok
>
>> I think it doesn't as there is no chance defer_map is called, thus
>> process_pending_softirqs -> vpci_process_pending -> read_lock
> Indeed, there's no chance of that because process_pending_softirqs
> will never try to do a scheduling operation that would result in our
> context being scheduled out.

     while ( (rc = rangeset_consume_ranges(mem, map_range, &data)) == -ERESTART 
)
     {
         /*
          * FIXME: Given the context apply_map is called from (dom0 specific
          * init code at system_state < SYS_STATE_active) it is not strictly
          * required that pdev->domain->vpci_rwlock is unlocked before calling
          * process_pending_softirqs as there is no contention possible between
          * this code and vpci_process_pending trying to acquire the lock in
          * read mode. But running process_pending_softirqs with any lock held
          * doesn't seem to be a good practice, so drop the lock and re-acquire
          * it right again.
          */
         write_unlock(&pdev->domain->vpci_rwlock);
         process_pending_softirqs();
         write_lock(&pdev->domain->vpci_rwlock);
     }

Will this be good enough?
>
> Thanks, Roger.
>
Thank you,
Oleksandr

Reply via email to