On Tue, Jul 4, 2023 at 5:49 PM Roger Pau Monné <roger....@citrix.com> wrote:
Hello all. [sorry for the possible format issues]

> On Tue, Jul 04, 2023 at 01:43:46PM +0200, Marek Marczykowski-Górecki wrote:
> > Hi,
> >
> > FWIW, I have ran into this issue some time ago too. I run Xen on top of
> > KVM and then passthrough some of the virtio devices (network one
> > specifically) into a (PV) guest. So, I hit both cases, the dom0 one and
> > domU one. As a temporary workaround I needed to disable
> > CONFIG_XEN_VIRTIO completely (just disabling
> > CONFIG_XEN_VIRTIO_FORCE_GRANT was not enough to fix it).
> >
> > With that context in place, the actual response below.
> >
> > On Tue, Jul 04, 2023 at 12:39:40PM +0200, Juergen Gross wrote:
> > > On 04.07.23 09:48, Roger Pau Monné wrote:
> > > > On Thu, Jun 29, 2023 at 03:44:04PM -0700, Stefano Stabellini wrote:
> > > > > On Thu, 29 Jun 2023, Oleksandr Tyshchenko wrote:
> > > > > > On 29.06.23 04:00, Stefano Stabellini wrote:
> > > > > > > I think we need to add a second way? It could be anything that can help
> > > > > > > us distinguish between a non-grants-capable virtio backend and a
> > > > > > > grants-capable virtio backend, such as:
> > > > > > > - a string on xenstore
> > > > > > > - a xen param
> > > > > > > - a special PCI configuration register value
> > > > > > > - something in the ACPI tables
> > > > > > > - the QEMU machine type
> > > > > >
> > > > > > Yes, I remember there was a discussion regarding that. The point is to
> > > > > > choose a solution to be functional for both PV and HVM *and* to be able
> > > > > > to support a hotplug. IIRC, the xenstore could be a possible candidate.
> > > > >
> > > > > xenstore would be among the easiest to make work. The only downside is
> > > > > the dependency on xenstore which otherwise virtio+grants doesn't have.
> > > >
> > > > I would avoid introducing a dependency on xenstore, if nothing else we
> > > > know it's a performance bottleneck.
> > > >
> > > > We would also need to map the virtio device topology into xenstore, so
> > > > that we can pass different options for each device.
> > >
> > > This aspect (different options) is important. How do you want to pass virtio
> > > device configuration parameters from dom0 to the virtio backend domain? You
> > > probably need something like Xenstore (a virtio based alternative like virtiofs
> > > would work, too) for that purpose.
> > >
> > > Mapping the topology should be rather easy via the PCI-Id, e.g.:
> > >
> > > /local/domain/42/device/virtio/0000:00:1c.0/backend
> >
> > While I agree this would probably be the simplest to implement, I don't
> > like introducing xenstore dependency into virtio frontend either.
> > Toolstack -> backend communication is probably easier to solve, as it's
> > much more flexible (could use qemu cmdline, QMP, other similar
> > mechanisms for non-qemu backends etc).
>
> I also think features should be exposed uniformly for devices, it's at
> least weird to have certain features exposed in the PCI config space
> while other features exposed in xenstore.
>
> For virtio-mmio this might get a bit confusing, are we going to add
> xenstore entries based on the position of the device config mmio
> region?
>
> I think on Arm PCI enumeration is not (usually?) done by the firmware,
> at which point the SBDF expected by the tools/backend might be
> different than the value assigned by the guest OS.
>
> I think there are two slightly different issues, one is how to pass
> information to virtio backends, I think doing this initially based on
> xenstore is not that bad, because it's an internal detail of the
> backend implementation. However passing information to virtio
> frontends using xenstore is IMO a bad idea, there's already a way to
> negotiate features between virtio frontends and backends, and Xen
> should just expand and use that.
On Arm with device-tree we have a special binding whose purpose is to inform us
whether we need to use grants for virtio, and the backend domid for a
particular device. Here on x86, we don't have a device tree, so we cannot
(easily?) reuse this logic.

I have just recollected one idea suggested by Stefano some time ago [1]. The
context of that discussion was what to do when device-tree and ACPI cannot be
reused (or something like that). The idea won't cover virtio-mmio, but I have
heard that virtio-mmio usage with x86 Xen is a rather unusual case. I will
paste the text below for convenience.

**********

Part 1 (intro):

We could reuse a PCI config space register to expose the backend id. However
this solution requires a backend change (QEMU) to expose the backend id via an
emulated register for each emulated device. To avoid having to introduce a
special config space register in all emulated PCI devices (virtio-net,
virtio-block, etc) I wonder if we could add a special PCI config space register
at the emulated PCI Root Complex level.

Basically the workflow would be as follows:
- Linux recognizes the PCI Root Complex as a Xen PCI Root Complex
- Linux writes the PCI device id (basically the BDF) to the special PCI config
  space register of the Xen PCI Root Complex
- The Xen PCI Root Complex emulated by Xen answers by writing back to the same
  location the backend id (domid of the backend)
- Linux reads back the same PCI config space register of the Xen PCI Root
  Complex and learns the relevant domid

Part 2 (clarification):

I think using a special config space register in the root complex would not be
terrible in terms of guest changes because it is easy to introduce a new root
complex driver in Linux and other OSes. The root complex would still be ECAM
compatible so the regular ECAM driver would still work. A new driver would only
be necessary if you want to be able to access the special config space
register.

**********

What do you think about it? Are there any pitfalls, etc?
This also requires guest changes, but at least without virtio spec changes.

[1] https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2210061747590.3690179@ubuntu-linux-20-04-desktop/

--
Regards,

Oleksandr Tyshchenko