On Wed, 30 Apr 2025, Roger Pau Monné wrote:
> On Wed, Apr 30, 2025 at 08:27:55AM +0200, Jan Beulich wrote:
> > On 29.04.2025 23:52, Stefano Stabellini wrote:
> > > On Tue, 29 Apr 2025, Jan Beulich wrote:
> > >> On 28.04.2025 22:00, Stefano Stabellini wrote:
> > >>> On Mon, 28 Apr 2025, Jan Beulich wrote:
> > >>>> On 25.04.2025 22:19, Stefano Stabellini wrote:
> > >>>>> --- a/xen/arch/x86/mm.c
> > >>>>> +++ b/xen/arch/x86/mm.c
> > >>>>> @@ -4401,7 +4401,7 @@ int steal_page(
> > >>>>>      const struct domain *owner;
> > >>>>>      int rc;
> > >>>>>  
> > >>>>> -    if ( paging_mode_external(d) )
> > >>>>> +    if ( paging_mode_external(d) && !is_hardware_domain(d) )
> > >>>>>          return -EOPNOTSUPP;
> > >>>>>  
> > >>>>>      /* Grab a reference to make sure the page doesn't change under our feet */
> > >>>>
> > >>>> Is this (in particular the code following below here) a safe thing to do
> > >>>> when we don't properly refcount page references from the P2M, yet? It's
> > >>>> Dom0, yes, but even there I might see potential security implications (as
> > >>>> to violating privacy of a guest).
> > >>>
> > >>> I don't think I am following, could you please elaborate more? The
> > >>> change I am proposing is to allow Dom0 to share its own pages to the
> > >>> co-processor. DomUs are not in the picture. I would be happy to add
> > >>> further restriction to that effect. Is there something else you have in
> > >>> mind?
> > >>
> > >> Once "shared" with the PSP, how would Xen know that this sharing has stopped?
> > >> Without knowing, how could it safely give the same page to a DomU later on?
> > >> ("Safely" in both directions: Without compromising privacy of the DomU and
> > >> without compromising host safety / security.)
> > > 
> > > Why would Xen later assign the same page to a DomU? The page comes
> > > from the hardware domain, which, as of today, cannot be destroyed. BTW I
> > > realize it is a bit different, but we have been doing the same thing
> > > with Dom0 1:1 mapped on ARM since the start of the project.
> > 
> > The life cycle of the page within Dom0 may be such that a need arises to
> > move it elsewhere (balloon out, grant-transfer, and what not).
> 
> I think it's up to dom0 to make sure the page is handled
> appropriately, in order for it to keep its special contiguity
> properties.
> 
> If the PSP is not using the IOMMU page-tables for DMA accesses, and
> the hardware domain can freely interact with it, there's no protection
> from such device accessing any random MFN on the system, and hence no
> refcounts or similar will protect from that.

Yes, exactly!


> The only protection would be Xen owning the device, and the hardware
> domain using an emulated/mediated interface to communicate with it.  I
> have no idea how complicated the PSP interface is, and whether it
> would be feasible to trap and emulate/mediate accesses in Xen.

There will always be the possibility of devices, co-processors, or
firmware not behind the IOMMU, and we won't be able to handle them all in
Xen.


> > >>>>> --- a/xen/common/memory.c
> > >>>>> +++ b/xen/common/memory.c
> > >>>>> @@ -794,7 +794,7 @@ static long memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
> > >>>>>              rc = guest_physmap_add_page(d, _gfn(gpfn), mfn,
> > >>>>>                                          exch.out.extent_order) ?: rc;
> > >>>>>  
> > >>>>> -            if ( !paging_mode_translate(d) &&
> > >>>>> +            if ( (!paging_mode_translate(d) || is_hardware_domain(d)) &&
> > >>>>>                   __copy_mfn_to_guest_offset(exch.out.extent_start,
> > >>>>>                                              (i << out_chunk_order) + j,
> > >>>>>                                              mfn) )
> > >>>>
> > >>>> Wait, no: A PVH domain (Dom0 or not) can't very well make use of MFNs, can
> > >>>> it?
> > >>>
> > >>> One way or another Dom0 PVH needs to know the MFN to pass it to the
> > >>> co-processor.
> > >>
> > >> I see. That's pretty odd, though. I'm then further concerned of the order of
> > >> the chunks. At present we're rather lax, in permitting PVH and PV Dom0 the
> > >> same upper bound. With both CPU and I/O side translation there is, in
> > >> principle, no reason to permit any kind of contiguity. Of course there's a
> > >> performance aspect, but that hardly matters in the specific case here. Yet at
> > >> the same time, once we expose MFNs, contiguity will start mattering as soon
> > >> as any piece of memory needs to be larger than PAGE_SIZE. Which means it will
> > >> make tightening of the presently lax handling prone to regressions in this
> > >> new behavior you're introducing. What chunk size does the PSP driver require?
> > > 
> > > I don't know. The memory returned by XENMEM_exchange is contiguous,
> > > right? Are you worried that Xen cannot allocate the requested amount of
> > > memory contiguously?
> > 
> > That would be Dom0's problem then. But really for a translated guest the
> > exchanged chunks being contiguous shouldn't matter, correctness-wise. That is,
> > within Xen, rather than failing a request, we could choose to retry using
> > discontiguous chunks (contiguous only in GFN space). Such an (afaict)
> > otherwise correct change would break your use case, as it would invalidate the
> > MFN information passed back. (This fallback approach would similarly apply to
> > other related mem-ops. It's just that during domain creation the tool stack
> > has its own fallback, so it may not be of much use right now.)
> 
> I think the description in the public header needs to be expanded to
> specify what the XENMEM_exchange does for translated guests, and
> clearly write down that the underlying MFNs for the exchanged region
> will be contiguous.  Possibly a new XENMEMF_ flag needs to be added to
> request contiguous physical memory for the new range.
> 
> Sadly this also has the side effect of quite likely shattering
> superpages for dom0 EPT/NPT, which will result in decreased dom0
> performance.
> 
> We have so far avoided exposing MFNs to HVM/PVH, but I don't see much
> way to avoid this if there's no option to use IOMMU or NPT page-tables
> with the PSP and we don't want to intercept PSP accesses in Xen and
> translate requests on the fly.
 
Yeah, I think the same way too.
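
To make the contiguity concern concrete, here is a minimal sketch (not
Xen or Linux code) of the check a Dom0 driver could apply to the MFNs
copied back by XENMEM_exchange before handing a buffer to the PSP. The
helper name mfns_contiguous() and the flat uint64_t array are
assumptions made purely for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of Xen or Linux: returns true if the
 * MFNs reported back for a buffer form one machine-contiguous run,
 * i.e. mfns[i] == mfns[0] + i for every i. An empty or single-entry
 * array is trivially contiguous.
 */
static bool mfns_contiguous(const uint64_t *mfns, size_t count)
{
    assert(mfns != NULL || count == 0);

    for ( size_t i = 1; i < count; i++ )
        if ( mfns[i] != mfns[0] + i )
            return false;

    return true;
}
```

If Xen ever fell back to chunks that are contiguous only in GFN space,
as Jan describes, a check like this would fail for any buffer larger
than PAGE_SIZE, which is why the contiguity guarantee would need to be
written down in the public header.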
