On Tue, Jun 30, 2020 at 02:13:36PM +0200, Jan Beulich wrote:
> On 26.06.2020 17:57, Roger Pau Monne wrote:
> > Commit e9aca9470ed86 introduced a regression when avoiding sending
> > IPIs for certain flush operations. Xen page fault handler
> > (spurious_page_fault) relies on blocking interrupts in order to
> > prevent handling TLB flush IPIs and thus preventing other CPUs from
> > removing page tables pages. Switching to assisted flushing avoided such
> > IPIs, and thus can result in pages belonging to the page tables being
> > removed (and possibly re-used) while __page_fault_type is being
> > executed.
> > 
> > Force some of the TLB flushes to use IPIs, thus avoiding the assisted
> > TLB flush. Those selected flushes are the page type change (when
> > switching from a page table type to a different one, ie: a page that
> > has been removed as a page table) and page allocation. This sadly has
> > a negative performance impact on the pvshim, as less assisted flushes
> > can be used. Note the flush in grant-table code is also switched to
> > use an IPI even when not strictly needed. This is done so that a
> > common arch_flush_tlb_mask can be introduced and always used in common
> > code.
> > 
> > Introduce a new flag (FLUSH_FORCE_IPI) and helper to force a TLB flush
> > using an IPI (flush_tlb_mask_sync, x86 only). Note that the flag is
> > only meaningfully defined when the hypervisor supports PV or shadow
> > paging mode, as otherwise hardware assisted paging domains are in
> > charge of their page tables and won't share page tables with Xen, thus
> > not influencing the result of page walks performed by the spurious
> > fault handler.
> > 
> > Just passing this new flag when calling flush_area_mask prevents the
> > usage of the assisted flush without any other side effects.
> > 
> > Note the flag is not defined on Arm.
> > 
> > Fixes: e9aca9470ed86 ('x86/tlb: use Xen L0 assisted TLB flush when 
> > available')
> > Reported-by: Andrew Cooper <andrew.coop...@citrix.com>
> > Signed-off-by: Roger Pau Monné <roger....@citrix.com>
> 
> In principle
> Reviewed-by: Jan Beulich <jbeul...@suse.com>
> A few cosmetic remarks though:
> 
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -2894,7 +2894,17 @@ static int _get_page_type(struct page_info *page, 
> > unsigned long type,
> >                        ((nx & PGT_type_mask) == PGT_writable_page)) )
> >                  {
> >                      perfc_incr(need_flush_tlb_flush);
> > -                    flush_tlb_mask(mask);
> > +                    if ( (x & PGT_type_mask) &&
> > +                         (x & PGT_type_mask) <= PGT_root_page_table )
> > +                        /*
> > +                         * If page was a page table make sure the flush is
> > +                         * performed using an IPI in order to avoid 
> > changing
> > +                         * the type of a page table page under the feet of
> > +                         * spurious_page_fault.
> > +                         */
> > +                        flush_tlb_mask_sync(mask);
> > +                    else
> > +                        flush_tlb_mask(mask);
> 
> Effectively this now is the only user of the new macro. I'd prefer
> avoiding its introduction (and hence avoiding the questionable
> "_sync" suffix), doing
> 
>     flush_mask(mask, FLUSH_TLB | (... ? FLUSH_FORCE_IPI : 0));

Right, maybe placing the '(x & PGT_type_mask) && (x & PGT_type_mask) <=
PGT_root_page_table' condition inside the parameter list of flush_mask
will make the code hard to read, so it might be worth to keep the
if?

> here and ...
> 
> > @@ -148,9 +158,24 @@ void flush_area_mask(const cpumask_t *, const void 
> > *va, unsigned int flags);
> >  /* Flush specified CPUs' TLBs */
> >  #define flush_tlb_mask(mask)                    \
> >      flush_mask(mask, FLUSH_TLB)
> > +/*
> > + * Flush specified CPUs' TLBs and force the usage of an IPI to do so. This 
> > is
> > + * required for certain operations that rely on page tables themselves not
> > + * being freed and reused when interrupts are blocked, as the flush IPI 
> > won't
> > + * be fulfilled until exiting from that critical region.
> > + */
> > +#define flush_tlb_mask_sync(mask)               \
> > +    flush_mask(mask, FLUSH_TLB | FLUSH_FORCE_IPI)
> >  #define flush_tlb_one_mask(mask,v)              \
> >      flush_area_mask(mask, (const void *)(v), FLUSH_TLB|FLUSH_ORDER(0))
> >  
> > +/*
> > + * Alias the common code TLB flush helper to the sync one in order to be 
> > on the
> > + * safe side. Note that not all calls from common code strictly require the
> > + * _sync variant.
> > + */
> > +#define arch_flush_tlb_mask flush_tlb_mask_sync
> 
> ...
> 
> #define arch_flush_tlb_mask(mask)               \
>     flush_mask(mask, FLUSH_TLB | FLUSH_FORCE_IPI)

Sure. Feel free to slightly adjust the comment, I think doing
s/Alias/Force/ would be enough.

> here. I'd be okay making these adjustments while committing, if
> you and others don't object.

That's fine, I leave to your judgment whether to use the ternary
operator in the _get_page_type case.

Roger.

Reply via email to