On 16/01/2023 8:56 am, Jan Beulich wrote:
> On 13.01.2023 12:55, Andrew Cooper wrote:
>> On 13/01/2023 8:47 am, Jan Beulich wrote:
>>> In _sh_propagate() I'm further puzzled: The iomem_access_permitted()
>>> certainly suggests very bad things could happen if it returned false
>>> (i.e. in the implicit "else" case). The assumption looks to be that no
>>> bad "target_mfn" can make it there. But overall things might end up
>>> looking more sane (and being cheaper) when simply using "mmio_mfn"
>>> instead.
>> That entire block looks suspect.  For one, I can't see why the ASSERT()
>> is correct; we have literally just (conditionally) added CACHE_ATTR to
>> pass_thru_flags and pulled everything across from gflags into sflags.
> That is for !shadow_mode_refcounts() domains, i.e. PV, whereas the
> outermost conditional here limits things to HVM. Using different
> predicates of course obfuscates this some, but bringing those two
> closer together (perhaps even merging them) didn't look reasonable
> to do right here.

Ah, that bit.  Also further obfuscated by partial nested !'s.

I doubt Shadow has seen anything beyond token testing in combination
with PCI Passthrough.  It certainly saw no testing under XenServer.

>> It can also half its number of external calls by rearranging the if/else
>> chain and making better use of the type variable.
> I did actually spend quite a bit of time to see whether I could figure
> a valid way of re-arranging the order, but in the end for every
> transformation I found a reason why it wouldn't be valid. So I'm
> curious what valid simplification(s) you see.

Well, the first two calls calls to pat_type_2_pte_flags() can be merged
quite easily, but I was also thinking in terms of future where coherency
handling was working in a more sane way.

>>> @@ -571,7 +571,7 @@ _sh_propagate(struct vcpu *v,
>>>                              gfn_to_paddr(target_gfn),
>>>                              mfn_to_maddr(target_mfn),
>>>                              X86_MT_UC);
>>> -                else if ( iommu_snoop )
>>> +                else if ( is_iommu_enabled(d) && iommu_snoop )
>>>                      sflags |= pat_type_2_pte_flags(X86_MT_WB);
>> Hmm...  This is still one reasonably expensive nop; the PTE Flags for WB
>> are 0.
> Right, but besides being unrelated to the patch (there's a following
> "else", so the condition cannot be purged altogether) I would wonder
> if we really want to bake in more PAT layout <-> PTE dependencies.

I'm not advocating for more assumption about PAT <-> PTE layout, but it
would be nice if the NOPs were actually NOPs.

I submitted a patch which makes pat_type_2_pte_flags() marginally less
expensive, but there's still massive savings to be made here.  Because
XEN's PAT is compile time constant, this inverse can be too.

>
>>> --- a/xen/drivers/passthrough/x86/iommu.c
>>> +++ b/xen/drivers/passthrough/x86/iommu.c
>>> @@ -56,6 +56,13 @@ void __init acpi_iommu_init(void)
>>>      if ( !acpi_disabled )
>>>      {
>>>          ret = acpi_dmar_init();
>>> +
>>> +#ifndef iommu_snoop
>>> +        /* A command line override for snoop control affects VT-d only. */
>>> +        if ( ret )
>>> +            iommu_snoop = true;
>>> +#endif
>> I really don't think this is a good idea.  If nothing else, you're
>> reinforcing the notion that this logic is somehow acceptable.
>>
>> If instead the comment read something like:
>>
>> /* This logic is pretty bogus, but necessary for now.  iommu_snoop as a
>> control is only wired up for VT-d (which may be conditionally compiled
>> out), and while AMD can control coherency, Xen forces coherent accesses
>> unilaterally so iommu_snoop needs to report true on all AMD systems for
>> logic elsewhere in Xen to behave correctly. */
> I've extended the comment to this:
>
>         /*
>          * As long as there's no per-domain snoop control, and as long as on
>          * AMD we uniformly force coherent accesses, a possible command line
>          * override should affect VT-d only.
>          */

Better.  I suppose my displeasure of this can live on list...

~Andrew

Reply via email to