On 10/2/19 10:10 AM, Andrew Cooper wrote:
> On 02/10/2019 09:40, Jan Beulich wrote:
>> On 01.10.2019 17:11, Paul Durrant wrote:
>>> Now that xl.cfg has an option to explicitly enable IOMMU mappings for a
>>> domain, migration may be needlessly vetoed due to the check of
>>> is_iommu_enabled() in paging_log_dirty_enable().
>>> There is actually no need to prevent logdirty from being enabled unless
>>> devices are assigned to a domain and that domain is sharing HAP mappings
>>> with the IOMMU (in which case disabling write permissions in the P2M may
>>> cause DMA faults). It is quite possible that some assigned devices may
>>> provide information about which pages may have been dirtied by DMA via
>>> an API exported by their managing emulator. Thus Xen's logdirty map is only
>>> one source of information that may be available to the toolstack when
>>> performing a migration and hence it is the toolstack that is best placed
>>> to decide under what circumstances it can be performed, not the hypervisor.
>> While I'm happy about the extended description, it's still written in
>> a way suggesting that this is the only possible way of viewing things.
>> As expressed by George and me, putting the hypervisor in a position to
>> be able to judge is at least an alternative worth considering.
> 
> No, for exactly the same reason as I'm purging the disable_migrate flag.
> 
> This is totally backwards thinking, because the check is in the wrong place.
> 
> There really are cases where the toolstack, *and only* the toolstack is
> in a position to determine migration safety.  When it comes to
> disable_migrate, the area under argument is the ITSC flag, which *is*
> safe to offer on migrate for viridian guests which are known to use
> reference_tsc, or if the destination hardware supports tsc scaling. 
> (Hilariously, nothing, not even the toolstack, prohibits migration based
> on Xen's no-migrate flag, because it's a write-only field which can't be
> retrieved by the tools.)
> 
> The two options are:
> 
> 1) New hypercall,
> DOMCTL_the_toolstack_knows_wtf_its_doing_so_let_the_doimain_migrate,
> which disables the vetos,
> 
> or
> 
> 2) Delete the erroneous vetos, and trust that the toolstack knows what
> it is doing, and will only initiate a migrate in safe situations.
> 
> Option 2 has the safety checks performed at the level which is actually
> capable of calculating the results correctly.
> 
> One of these options is substantially less bone-headed than the other.

Indeed, duplicating the knowledge of the internal details of how
logdirty works in every single copy of the toolstack is a boneheaded
idea. </sarcasm>

Now can we please drop the inflammatory language?

At the moment, this patch (if I understand correctly) will cause a
regression in xl: before this patch, `xl migrate` would correctly fail
to start migration if PCI devices were assigned; after this patch, `xl
migrate` will go through with migration, only to potentially have a
garbled domain on the far side.  Is that not correct?

Having a flag to paging_log_dirty_enable() which says, "Ignore device
conflicts, I can get logdirty information from external sources" is a
perfectly sensible interface: it allows more capable toolstacks to
specify their capabilities, while allowing less capable toolstacks not
to need to know the internals of Xen.
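
To make the shape of that interface concrete, here is a minimal sketch in
C. It is not the real paging_log_dirty_enable(): the struct, the flag name
LOGDIRTY_IGNORE_DEVICES, the function name and the error code are all
hypothetical, and the rule it encodes (shared HAP/IOMMU mappings can never
be overridden, other assigned devices can be overridden by the flag) is
only my reading of the discussion above.

```c
#include <stdbool.h>
#include <errno.h>

/* Hypothetical stand-in for the per-domain state the check would consult. */
struct domain_sketch {
    bool has_assigned_devices;  /* any passthrough devices assigned */
    bool iommu_shares_hap_pt;   /* IOMMU shares the HAP (P2M) mappings */
    bool log_dirty_enabled;
};

/* Hypothetical flag: the caller gets DMA dirty information elsewhere. */
#define LOGDIRTY_IGNORE_DEVICES (1u << 0)

/* Sketch of a logdirty-enable check taking a capability flag. */
static int log_dirty_enable_sketch(struct domain_sketch *d, unsigned int flags)
{
    if ( d->has_assigned_devices )
    {
        /* Write-protecting a shared P2M may cause DMA faults: always veto. */
        if ( d->iommu_shares_hap_pt )
            return -EINVAL;

        /* Otherwise veto unless the toolstack declares it can cope. */
        if ( !(flags & LOGDIRTY_IGNORE_DEVICES) )
            return -EINVAL;
    }

    d->log_dirty_enabled = true;
    return 0;
}
```

A less capable toolstack would pass 0 and keep today's veto; a more capable
one would pass the flag and take responsibility for tracking DMA writes.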

And there's yet another option you don't list here which I thought I'd
mentioned: the emulators and/or the toolstack which report logdirty
information to the toolstack can tell Xen, "I am providing logdirty
information for device $BDF."  Then the logdirty check can be, "Are all
devices assigned to the domain providing logdirty information?" And Xen
can fail the migration if there's a device assigned that doesn't have
this flag set.
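
Roughly, the per-device check could look like the sketch below. Everything
in it is hypothetical (the struct, the provides_logdirty field and the
function name do not exist in Xen); it only illustrates the "are all
assigned devices covered?" test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one device assigned to the domain. */
struct assigned_dev_sketch {
    unsigned int bdf;        /* bus/device/function identifier */
    bool provides_logdirty;  /* set once an emulator/toolstack claims it */
};

/* Return true only if every assigned device has a logdirty provider. */
static bool all_devices_provide_logdirty(const struct assigned_dev_sketch *devs,
                                         size_t n)
{
    for ( size_t i = 0; i < n; i++ )
        if ( !devs[i].provides_logdirty )
            return false;  /* one uncovered device is enough to veto */

    return true;           /* also true when no devices are assigned */
}
```

Xen would fail the logdirty enable (and hence the migration) whenever this
returns false, without the toolstack needing to know why.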

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
