On 13/02/2019 21:08, Michael Labriola wrote:
> On Wed, Feb 13, 2019 at 3:21 PM Andrew Cooper <andrew.coop...@citrix.com> 
> wrote:
>> On 13/02/2019 20:15, Michael Labriola wrote:
>>> On Wed, Feb 13, 2019 at 2:16 PM Konrad Rzeszutek Wilk
>>> <konrad.w...@oracle.com> wrote:
>>>> On Wed, Feb 13, 2019 at 01:38:21PM -0500, Michael Labriola wrote:
>>>>> On Wed, Feb 13, 2019 at 1:16 PM Michael Labriola
>>>>> <michael.d.labri...@gmail.com> wrote:
>>>>>> On Wed, Feb 13, 2019 at 11:57 AM Konrad Rzeszutek Wilk
>>>>>> <konrad.w...@oracle.com> wrote:
>>>>>>> On Wed, Feb 13, 2019 at 09:09:32AM -0700, Jan Beulich wrote:
>>>>>>>>>>> On 13.02.19 at 17:00, <michael.d.labri...@gmail.com> wrote:
>>>>>>>>> On Wed, Feb 13, 2019 at 9:28 AM Jan Beulich <jbeul...@suse.com> wrote:
>>>>>>>>>>>>> On 13.02.19 at 15:10, <michael.d.labri...@gmail.com> wrote:
>>>>>>>>>>> Ah, so this isn't necessarily Xen-specific but rather any 
>>>>>>>>>>> paravirtual
>>>>>>>>>>> guest?  That hadn't crossed my mind.  Is there an easy way to find 
>>>>>>>>>>> out
>>>>>>>>>>> if we're a pv guest in the need_swiotlb conditionals?
>>>>>>>>>> There's xen_pv_domain(), but I think xen_swiotlb would be more to
>>>>>>>>>> the point if the check is already to be Xen-specific. There's no 
>>>>>>>>>> generic
>>>>>>>>>> "is PV" predicate that I'm aware of.
>>>>>>>>> Well, that makes doing conditional code right more difficult.  I
>>>>>>>>> assume since there isn't a generic predicate, and PV isn't new, that
>>>>>>>>> it's absence is by design?  To reign in the temptation to sprinkle
>>>>>>>>> conditional code all over the kernel?  ;-)
>>>>>>>> Well, with lguest gone, Xen is the only PV environment the kernel
>>>>>>>> can run in, afaik at least. I guess to decide between the suggested
>>>>>>>> options or the need for some abstracting macro (or yet something
>>>>>>>> else), you'll really need to ask the driver maintainers. Or simply
>>>>>>>> send a patch their way implementing one of them, and see what
>>>>>>>> their reaction is.
>>>>>>> Could you try this out and see if it works and I will send it out:
>>>>>>>
>>>>> *snip*
>>>>>> Yes, that works for me.  However, I feel like the conditional should
>>>>>> be in drm_get_max_iomem() instead of directly after it everywhere it's
>>>>>> used...  and is just checking xen_pv_domain() enough?  Jan made it
>>>>>> sound like there were possibly other PV cases that would also still
>>>>>> need swiotlb.
>>>>> How about this?  It strcmp's pv_info to see if we're bare metal, does
>>>>> the comparison in a single place, moves the bit shifting comparison
>>>>> into the function (simplifying the drm driver code), and renames the
>>>>> function to more aptly describe what's going on.
>>>> <nods> That looks much better.
>>> Great!  Now the only question left is:  KVM, VMware, Xen PVH, Xen HVM,
>>> and Xen PV all populate pv_info.  Do any of those other than Xen PV
>>> *really* need swiotlb.  That's slightly over my head.  As written, my
>>> patch would require swiotlb for all of them because I was attempting
>>> to not be Xen-specific.
>> Its far more complicated that "Xen PV requires swiotlb".
>>
>> I presume the underlying problem here is DRM being special and not
>> DMA-mapping its buffers, and trying to DMA to a buffer crossing a 4k
>> boundary?
> Well, I don't 100% understand how all these things work...  but here's
> what I do know.  There are a series of commits in v4.17 that try to
> optimize the radeon and amdgpu drivers by skipping calls to
> ttm_dma_populate() and ttm_dma_unpopulate() unless they're "really
> needed".  The original commit determines if swiotlb is needed by
> checking to see if the max io mapping address of system memory is over
> the video card's accessing range.  I can no longer log into Gnome on a
> Xen dom0 after upgrading my kernel to v4.20 because the code that's no
> longer happening was actually needed in a paravirtualized environment.

But from the log you provided, your bug was space exhaustion in the
swiotlb, no?

> So, I'm trying to get all my details straight so I can submit a patch
> to fix it w/out saying anything factually incorrect.

The thing which is different between Xen PV guests and most others (all
others(?), now that Lguest and UML have been dropped) is that what Linux
thinks of as PFN $N isn't necessarily adjacent to PFN $N+1 in system
physical address space.

Therefore, code which has a buffer spanning a page boundary can't just
convert a pointer to the buffer into a physical address, and hand that
address to a device.  You generally end up with either memory corruption
(DMA hitting the wrong page allocated to the guest), or an IOMMU fault
(DMA hitting a pages which isn't allocated to the guest).

Xen PV is very good at finding DMA bugs in drivers.  The way to resolve
this is to fix the driver to use the proper DMA APIs - not to add even
more magic corner cases.

In general, a lot of devices can do 4k scatter/gather, or end up making
requests to buffers which fit within a single page, but the SWIOTLB does
act as a mechanism of last resort.  It has a massive performance penalty
(due to double buffering), and does have a tendency to fragment (due to
asymmetric size requests).

However, there is one DMA mode (in the process of getting properly
upstream, but has been used for several years by various downstreams)
where IOVA == Linux's idea of contiguous PFN space, so you can do odd
sized DMAs which cross page boundaries.

The point is that the DMA ops (and *only* the DMA ops, from a
correctness standpoint) know how to convert PFNs into IO-virtual
addresses for devices, because it may not be a 1:1 mapping.  Nothing
else in the kernel can legitimately be making decisions like this.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Reply via email to