RE: [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

From: Jork Loeser
Date: Thu Aug 10 2017 - 21:15:27 EST


> -----Original Message-----
> From: Peter Zijlstra [mailto:peterz@xxxxxxxxxxxxx]
> Sent: Thursday, August 10, 2017 12:28
> To: Jork Loeser <Jork.Loeser@xxxxxxxxxxxxx>
> Cc: KY Srinivasan <kys@xxxxxxxxxxxxx>; Simon Xiao <sixiao@xxxxxxxxxxxxx>;
> Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; Stephen Hemminger
> <sthemmin@xxxxxxxxxxxxx>; torvalds@xxxxxxxxxxxxxxxxxxxx; luto@xxxxxxxxxx;
> hpa@xxxxxxxxx; vkuznets@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> rostedt@xxxxxxxxxxx; andy.shevchenko@xxxxxxxxx; tglx@xxxxxxxxxxxxx;
> mingo@xxxxxxxxxx; linux-tip-commits@xxxxxxxxxxxxxxx
> Subject: Re: [tip:x86/platform] x86/hyper-v: Use hypercall for remote TLB flush

> > > > Hold on.. if we don't IPI for TLB invalidation. What serializes
> > > > our software page table walkers like fast_gup() ?
> > >
> > > Hypervisor may implement this functionality via an IPI.
> > >
> > > K. Y
> >
> > HvFlushVirtualAddressList() states:
> > This call guarantees that by the time control returns back to the
> > caller, the observable effects of all flushes on the specified virtual
> > processors have occurred.
> >
> > HvFlushVirtualAddressListEx() refers to HvFlushVirtualAddressList() as
> > adding sparse target VP lists.
> >
> > Is this enough of a guarantee, or do you see other races?
>
> That's nowhere near enough. We need the remote CPU to have completed any
> guest IF section that was in progress at the time of the call.
>
> So if a host IPI can interrupt a guest while the guest has IF cleared, and we then
> process the host IPI -- clear the TLBs -- before resuming the guest, which still has
> IF cleared, we've got a problem.
>
> Because at that point, our software page-table walker, that relies on IF being
> clear to guarantee the page-tables exist, because it holds off the TLB invalidate
> and thereby the freeing of the pages, gets its pages ripped out from under it.

I see: cleared IF acts as a lock that keeps the page tables from being freed under the walker. Would CONFIG_HAVE_RCU_TABLE_FREE be an option for x86? There are caveats (it is statically enabled, and page-table pages are freed via RCU), yet if the resulting performance is still a net gain, it would be worthwhile for Hyper-V-targeted kernels.
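
To make the race concrete, here is a simplified, non-compilable sketch of the pattern Peter describes. The function names mirror the kernel's (fast_gup / flush_tlb_others / tlb_remove_table), but the bodies are placeholders, not the actual implementation:

```c
/* Lockless walker, e.g. __get_user_pages_fast() on bare metal: */
local_irq_save(flags);     /* IF=0: the TLB-shootdown IPI cannot be
                              delivered to this CPU...                */
walk_page_tables();        /* ...so the unmapper below is still waiting
                              for our IPI ack, and the page tables
                              cannot be freed while we read them.     */
local_irq_restore(flags);  /* IF=1: IPI delivered, unmapper proceeds. */

/* Unmapper on another CPU: */
unmap_region();
flush_tlb_others();        /* bare metal: sends IPIs and waits for acks
                              from all CPUs, i.e. for every walker to
                              leave its IF=0 section                  */
free_page_tables();        /* only now is freeing safe                */
```

With the HvFlushVirtualAddressList() hypercall, no guest-visible IPI is sent, so flush_tlb_others() no longer waits out the walker's IF=0 section and free_page_tables() can race with the walk. CONFIG_HAVE_RCU_TABLE_FREE would instead defer the freeing through RCU (tlb_remove_table()), restoring the guarantee without relying on the IPI.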

Regards,
Jork