On 14/02/2019 01:11, Andrew Cooper wrote:
> On 13/02/2019 21:08, Michael Labriola wrote:
>> On Wed, Feb 13, 2019 at 3:21 PM Andrew Cooper <andrew.coop...@citrix.com> 
>> wrote:
>>> On 13/02/2019 20:15, Michael Labriola wrote:
>>>> On Wed, Feb 13, 2019 at 2:16 PM Konrad Rzeszutek Wilk
>>>> <konrad.w...@oracle.com> wrote:
>>>>> On Wed, Feb 13, 2019 at 01:38:21PM -0500, Michael Labriola wrote:
>>>>>> On Wed, Feb 13, 2019 at 1:16 PM Michael Labriola
>>>>>> <michael.d.labri...@gmail.com> wrote:
>>>>>>> On Wed, Feb 13, 2019 at 11:57 AM Konrad Rzeszutek Wilk
>>>>>>> <konrad.w...@oracle.com> wrote:
>>>>>>>> On Wed, Feb 13, 2019 at 09:09:32AM -0700, Jan Beulich wrote:
>>>>>>>>>>>> On 13.02.19 at 17:00, <michael.d.labri...@gmail.com> wrote:
>>>>>>>>>> On Wed, Feb 13, 2019 at 9:28 AM Jan Beulich <jbeul...@suse.com> 
>>>>>>>>>> wrote:
>>>>>>>>>>>>>> On 13.02.19 at 15:10, <michael.d.labri...@gmail.com> wrote:
>>>>>>>>>>>> Ah, so this isn't necessarily Xen-specific but rather any 
>>>>>>>>>>>> paravirtual
>>>>>>>>>>>> guest?  That hadn't crossed my mind.  Is there an easy way to find 
>>>>>>>>>>>> out
>>>>>>>>>>>> if we're a pv guest in the need_swiotlb conditionals?
>>>>>>>>>>> There's xen_pv_domain(), but I think xen_swiotlb would be more to
>>>>>>>>>>> the point if the check is already to be Xen-specific. There's no 
>>>>>>>>>>> generic
>>>>>>>>>>> "is PV" predicate that I'm aware of.
>>>>>>>>>> Well, that makes doing conditional code right more difficult.  I
>>>>>>>>>> assume since there isn't a generic predicate, and PV isn't new, that
>>>>>>>>>> it's absence is by design?  To reign in the temptation to sprinkle
>>>>>>>>>> conditional code all over the kernel?  ;-)
>>>>>>>>> Well, with lguest gone, Xen is the only PV environment the kernel
>>>>>>>>> can run in, afaik at least. I guess to decide between the suggested
>>>>>>>>> options or the need for some abstracting macro (or yet something
>>>>>>>>> else), you'll really need to ask the driver maintainers. Or simply
>>>>>>>>> send a patch their way implementing one of them, and see what
>>>>>>>>> their reaction is.
>>>>>>>> Could you try this out and see if it works and I will send it out:
>>>>>>>>
>>>>>> *snip*
>>>>>>> Yes, that works for me.  However, I feel like the conditional should
>>>>>>> be in drm_get_max_iomem() instead of directly after it everywhere it's
>>>>>>> used...  and is just checking xen_pv_domain() enough?  Jan made it
>>>>>>> sound like there were possibly other PV cases that would also still
>>>>>>> need swiotlb.
>>>>>> How about this?  It strcmp's pv_info to see if we're bare metal, does
>>>>>> the comparison in a single place, moves the bit shifting comparison
>>>>>> into the function (simplifying the drm driver code), and renames the
>>>>>> function to more aptly describe what's going on.
>>>>> <nods> That looks much better.
>>>> Great!  Now the only question left is:  KVM, VMware, Xen PVH, Xen HVM,
>>>> and Xen PV all populate pv_info.  Do any of those other than Xen PV
>>>> *really* need swiotlb.  That's slightly over my head.  As written, my
>>>> patch would require swiotlb for all of them because I was attempting
>>>> to not be Xen-specific.
>>> Its far more complicated that "Xen PV requires swiotlb".
>>>
>>> I presume the underlying problem here is DRM being special and not
>>> DMA-mapping its buffers, and trying to DMA to a buffer crossing a 4k
>>> boundary?
>> Well, I don't 100% understand how all these things work...  but here's
>> what I do know.  There are a series of commits in v4.17 that try to
>> optimize the radeon and amdgpu drivers by skipping calls to
>> ttm_dma_populate() and ttm_dma_unpopulate() unless they're "really
>> needed".  The original commit determines if swiotlb is needed by
>> checking to see if the max io mapping address of system memory is over
>> the video card's accessing range.  I can no longer log into Gnome on a
>> Xen dom0 after upgrading my kernel to v4.20 because the code that's no
>> longer happening was actually needed in a paravirtualized environment.
> 
> But from the log you provided, your bug was space exhaustion in the
> swiotlb, no?
> 
>> So, I'm trying to get all my details straight so I can submit a patch
>> to fix it w/out saying anything factually incorrect.
> 
> The thing which is different between Xen PV guests and most others (all
> others(?), now that Lguest and UML have been dropped) is that what Linux
> thinks of as PFN $N isn't necessarily adjacent to PFN $N+1 in system
> physical address space.
> 
> Therefore, code which has a buffer spanning a page boundary can't just
> convert a pointer to the buffer into a physical address, and hand that
> address to a device.  You generally end up with either memory corruption
> (DMA hitting the wrong page allocated to the guest), or an IOMMU fault
> (DMA hitting a pages which isn't allocated to the guest).
> 
> Xen PV is very good at finding DMA bugs in drivers.  The way to resolve
> this is to fix the driver to use the proper DMA APIs - not to add even
> more magic corner cases.
> 
> In general, a lot of devices can do 4k scatter/gather, or end up making
> requests to buffers which fit within a single page, but the SWIOTLB does
> act as a mechanism of last resort.  It has a massive performance penalty
> (due to double buffering), and does have a tendency to fragment (due to
> asymmetric size requests).
> 
> However, there is one DMA mode (in the process of getting properly
> upstream, but has been used for several years by various downstreams)
> where IOVA == Linux's idea of contiguous PFN space, so you can do odd
> sized DMAs which cross page boundaries.
> 
> The point is that the DMA ops (and *only* the DMA ops, from a
> correctness standpoint) know how to convert PFNs into IO-virtual
> addresses for devices, because it may not be a 1:1 mapping.  Nothing
> else in the kernel can legitimately be making decisions like this.

Correct. Adding Christoph who might want to add something.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Reply via email to