Hi Julien,
On 12/3/25 11:32, Julien Grall wrote:
Hi,
On 02/12/2025 22:08, Milan Djokic wrote:
Hi Julien,
On 11/27/25 11:22, Julien Grall wrote:
We have changed the vIOMMU design from a 1-N to an N-N mapping between
vIOMMU and pIOMMU. Considering the single-vIOMMU model limitation
pointed out by Volodymyr (SID overlaps), the vIOMMU-per-pIOMMU model
turned out to be the only proper solution.
> Does this mean in your solution you will end up with multiple
> vPCI as well and then map pBDF == vBDF? (this is because the SID has
> to be fixed at boot)
>
To answer your question, yes, we will have multiple vPCI nodes with
this model, establishing a 1-1 vSID-pSID mapping (the same iommu-map
range for pPCI and vPCI).
As for 1-1 pBDF-to-vBDF mapping, I'm not sure it is necessary. My
understanding is that the vBDF->pBDF mapping does not affect the
vSID->pSID mapping. Am I wrong here?
From my understanding, the mapping between a vBDF and vSID is set up at
domain creation (as this is described in ACPI/Device-Tree). As PCI
devices can be hotplugged, if you want to enforce vSID == pSID, then
you indirectly need to enforce vBDF == pBDF.
I was not aware of that. I will have to do a detailed analysis on this
and come back with a solution. Right now I'm not sure how, and if,
enumeration will work with the multi-vIOMMU/vPCI model. If it turns out
not to be possible, we will have to introduce a mapping layer for
vSID->pSID and go back to the single vPCI/vIOMMU model, along the lines
of the sketch below.
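For illustration, such a mapping layer could look roughly like this
hypothetical sketch (names and types are made up, not actual Xen code):
a per-domain table translating each vSID to the pSID and pIOMMU it
belongs to, consulted whenever a guest command is forwarded.

```c
/* Hypothetical sketch of a per-domain vSID -> pSID mapping layer.
 * Names and types are illustrative, not actual Xen code. */
#include <stddef.h>
#include <stdint.h>

struct sid_map_entry {
    uint32_t vsid;       /* stream ID as seen by the guest */
    uint32_t psid;       /* stream ID on the physical IOMMU */
    unsigned int piommu; /* index of the physical IOMMU instance */
};

struct viommu_domain {
    struct sid_map_entry *sid_map;
    size_t nr_sids;
};

/* Translate a guest-visible SID before forwarding a command to the
 * physical IOMMU; returns 0 on success, -1 if the vSID is not
 * assigned to this domain (the command must then be rejected). */
static int vsid_to_psid(const struct viommu_domain *d, uint32_t vsid,
                        uint32_t *psid, unsigned int *piommu)
{
    for ( size_t i = 0; i < d->nr_sids; i++ )
    {
        if ( d->sid_map[i].vsid == vsid )
        {
            *psid = d->sid_map[i].psid;
            *piommu = d->sid_map[i].piommu;
            return 0;
        }
    }

    return -1;
}
```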
[...]
- **Runtime Configuration**: Introduces a `viommu` boot parameter for
dynamic enablement.
A separate vIOMMU device is exposed to the guest for every physical
IOMMU in the system.
The vIOMMU feature is designed to provide a generic vIOMMU framework
and a backend implementation for the target IOMMU as separate
components.
The backend implementation contains the IOMMU-specific structure and
command handling (only SMMUv3 is currently supported).
This structure allows potential reuse of the stage-1 feature for other
IOMMU types; a rough sketch of the split is shown below.
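To illustrate the framework/backend split, the generic layer could
dispatch through a set of backend operations, with vSMMUv3 being the
only backend for now. This is only a sketch with hypothetical names,
not the actual interface from the patch series:

```c
/* Hypothetical sketch of the generic-framework/backend split.
 * The real interface in the patch series may differ. */
#include <stdint.h>

struct viommu;

/* Operations every vIOMMU backend (e.g. vSMMUv3) would implement. */
struct viommu_ops {
    const char *name;
    int (*domain_init)(struct viommu *v);
    /* Emulate guest MMIO accesses to the vIOMMU register frame. */
    int (*mmio_read)(struct viommu *v, uint64_t offset, uint64_t *val);
    int (*mmio_write)(struct viommu *v, uint64_t offset, uint64_t val);
    /* Process pending entries on the guest command queue. */
    int (*handle_cmdqueue)(struct viommu *v);
};

struct viommu {
    const struct viommu_ops *ops; /* selected to match the pIOMMU type */
    void *backend_data;           /* e.g. vSMMUv3 register state */
};

/* Generic dispatch: the framework does not need to know whether the
 * backend is vSMMUv3 or a future backend for another IOMMU type. */
static inline int viommu_mmio_write(struct viommu *v, uint64_t offset,
                                    uint64_t val)
{
    return v->ops->mmio_write(v, offset, val);
}
```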
Security Considerations
=======================
**viommu security benefits:**
- Stage-1 translation ensures guest devices cannot perform unauthorized
DMA (device I/O address mappings are managed by the guest).
- The emulated IOMMU removes the guest's direct dependency on IOMMU
hardware, while maintaining domain isolation.
Sorry, I don't follow this argument. Are you saying that it would be
possible to emulate an SMMUv3 vIOMMU on top of the IPMMU?
No, this would not work. The emulated IOMMU has to match the pIOMMU
type. The argument only points out that we are emulating the IOMMU, so
the guest does not need a direct HW interface for IOMMU functions.
Sorry, but I am still missing how this is a security benefit.
Yes, this is a mistake. This should be in the design section.
[...]
2. Observation:
---------------
Guests can now invalidate Stage-1 caches; invalidations need to be
forwarded to the SMMUv3 hardware to maintain coherence.
**Risk:**
Failing to propagate a cache invalidation could leave stale mappings in
place, enabling access through old mappings and possibly data leakage
or misrouting.
You are referring to data leakage/misrouting between two devices owned
by the same guest, right? Xen would still be in charge of flushing when
the stage-2 is updated.
Yes, this risk could affect only guests, not Xen.
But it would affect a single guest, right? IOW, it is not possible for
guest A to leak data to guest B even if we don't properly invalidate
stage-1. Correct?
Correct. I don't see any possible scenario for data leakage between
different guests, just between two devices assigned to the same guest.
I will elaborate on this risk to make it clearer; a rough sketch of the
forwarding path is below.
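To make the forwarding concrete, here is a hypothetical sketch of the
invalidation path (the command encoding is simplified and all names are
made up, not actual Xen code):

```c
/* Hypothetical sketch of stage-1 invalidation forwarding; the command
 * encoding is simplified and the names are illustrative. */
#include <stdint.h>

#define CMD_TLBI_NH_VA 0x12 /* illustrative opcode value */

struct guest_cmd {
    uint8_t  opcode;
    uint16_t asid;
    uint64_t iova;
};

/* Assumed helper: issue the equivalent invalidation on the physical
 * SMMUv3, restricted to the VMID backing this guest's stage-1. */
extern void psmmu_tlbi_va(uint16_t vmid, uint16_t asid, uint64_t iova);

static void forward_stage1_tlbi(uint16_t vmid,
                                const struct guest_cmd *cmd)
{
    if ( cmd->opcode == CMD_TLBI_NH_VA )
        /* Skipping this forwarding would leave stale stage-1 TLB
         * entries: a device owned by the guest could keep using an old
         * mapping, leaking data between that guest's own devices. */
        psmmu_tlbi_va(vmid, cmd->asid, cmd->iova);
}
```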
4. Observation:
---------------
The code includes transformations to handle nested translation versus
standard modes and uses guest-configured
command queues (e.g., `CMD_CFGI_STE`) and event notifications.
**Risk:**
Malicious or malformed queue commands from guests could bypass
validation, manipulate SMMUv3 state,
or cause system instability.
**Mitigation:** *(Handled by design)*
Built-in validation of command queue entries and sanitization mechanisms
ensure only permitted configurations
are applied.
This is true as long as we didn't make a mistake in the
configuration ;).
Yes, but I don’t see anything we can do to prevent configuration mistakes.
There is nothing really preventing it. Same for ...
This is supported via additions in `vsmmuv3` and `cmdqueue`
handling code.
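For context, the sanitization amounts to something like the sketch
below (illustrative only; the opcode value is from the SMMUv3 spec, but
the helpers and structures are made up, not the actual
`vsmmuv3`/`cmdqueue` code): every SID referenced by a guest command is
checked against the set of devices assigned to that domain before
anything is forwarded to the hardware.

```c
/* Illustrative sketch of guest command queue sanitization; helpers
 * and structures are made up, not the actual vsmmuv3/cmdqueue code. */
#include <stdbool.h>
#include <stdint.h>

#define CMD_CFGI_STE 0x03 /* SMMUv3 "invalidate STE" command opcode */

struct guest_cmd {
    uint8_t  opcode;
    uint32_t sid;
};

/* Assumed helpers provided by the vIOMMU framework. */
extern bool sid_owned_by_domain(uint32_t sid);
extern void forward_cfgi_ste(uint32_t sid);

static int handle_guest_cmd(const struct guest_cmd *cmd)
{
    switch ( cmd->opcode )
    {
    case CMD_CFGI_STE:
        /* Reject commands targeting SIDs outside the guest's assigned
         * devices, so a malformed queue cannot touch other domains'
         * STEs. */
        if ( !sid_owned_by_domain(cmd->sid) )
            return -1;
        forward_cfgi_ste(cmd->sid);
        return 0;
    default:
        /* Unknown/unsupported commands are dropped, not forwarded. */
        return -1;
    }
}
```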
5. Observation:
---------------
Device Tree modifications enable device assignment and configuration;
guest DT fragments (e.g., `iommus`) are added via `libxl`.
**Risk:**
Erroneous or malicious Device Tree injection could result in device
misbinding or guest access to unauthorized
hardware.
The DT fragments are not security supported and never will be, at
least until you have a libfdt that is able to detect a malformed
Device-Tree (I haven't checked if this has changed recently).
But shouldn't this still be considered a risk? Similar to the previous
observation, the system integrator should ensure that DT fragments are
correct.
... this one. I agree they are risks, but they don't provide much
input into the design of the vIOMMU.
I get your point. I can remove them if they are considered overhead in
this context.
I am a lot more concerned about the scheduling part, because the
resources are shared.
My understanding is that there is only a single physical event queue.
Xen would be responsible for handling the events in the queue and
forwarding them to the respective guests. If so, it is not clear what
you mean by "disable event queue".
I was referring to the emulated IOMMU event queue. The idea is to make
it optional for guests. When disabled, events won't be propagated to
the guest.
But Xen will still receive the events, correct? If so, how does that
make things better?
You are correct, Xen will still receive events and handle them in the
pIOMMU driver. This is only a mitigation for the part introduced by the
vIOMMU design (event emulation), not a complete solution. This risk has
a more general context and could also apply to stage-2-only guests
(e.g. guests that perform DMA to an address they are not allowed to
access, causing translation faults).
But IMO, mitigation for physical event queue flooding should be part of
the pIOMMU driver design, e.g. along the lines of the sketch below.
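Just to illustrate what I mean: in the pIOMMU driver, the event path
could throttle per-guest forwarding along these lines (purely a sketch
with made-up names, not an actual design proposal):

```c
/* Sketch of per-guest event forwarding with a simple rate limit;
 * purely illustrative, not the actual pIOMMU driver design. */
#include <stdbool.h>
#include <stdint.h>

struct guest_evtq {
    bool enabled;    /* per-guest opt-in for the emulated event queue */
    uint32_t budget; /* events still allowed in the current window,
                      * replenished periodically elsewhere */
};

/* Assumed helper: inject the event into the guest's emulated queue. */
extern void inject_guest_event(struct guest_evtq *q, uint64_t evt);

/* Called from the physical event queue handler once Xen has decoded
 * the event and identified the owning guest. */
static void forward_event(struct guest_evtq *q, uint64_t evt)
{
    if ( !q->enabled )
        return; /* guest opted out: the event is handled in Xen only */

    if ( q->budget == 0 )
        return; /* window exhausted: drop to avoid flooding the guest */

    q->budget--;
    inject_guest_event(q, evt);
}
```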
Best regards,
Milan