Re: [PATCH 3/4] mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE

From: Benjamin Herrenschmidt
Date: Thu Aug 23 2018 - 00:55:08 EST


On Wed, 2018-08-22 at 20:59 -0700, Linus Torvalds wrote:
> On Wed, Aug 22, 2018 at 8:45 PM Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
> >
> > powerpc/radix has no such issue, it already does this tracking.
>
> Yeah, I now realize that this was why you wanted to add that hacky
> thing to the generic code, so that you can add the tlb_flush_pgtable()
> call.
>
> I thought it was because powerpc had some special flush instruction
> for it, and the regular tlb flush didn't do it. But no. It was because
> the regular code had lost the tlb flush _entirely_, because powerpc
> didn't want it.

Heh :-) Well, back on hash we didn't (we do now with Radix) but I
wouldn't blame us for the generic code being broken ... the RCU table
freeing was in arch/powerpc at the time :-) I don't think it was us
making it generic :)

> > We were discussing this a couple of months ago, I wasn't aware of ARM's
> > issue but I suggested x86 could go the same way as powerpc.
>
> The problem is that x86 _used_ to do this all correctly long long ago.
>
> And then we switched over to the "generic" table flushing (which
> harkens back to the powerpc code).

Yes, we wrote it the RCU stuff to solve the races with SW walking,
which is completely orthogonal with HW walking & TLB content. We didn't
do the move to generic code though ;-)

> Which actually turned out to be not generic at all, and did not flush
> the internal pages like x86 used to (back when x86 just used
> tlb_remove_page for everything).

Well, having RCU do the flushing is rather generic, it makes sense
whenever there's somebody doing a SW walk *and* you don't have IPIs to
synchronize your flushes (ie, anybody with HW TLB invalidation
broadcast basically, so ARM and us).

> So as a result, x86 had unintentionally lost the TLB flush we used to
> have, because tlb_remove_table() had lost the tlb flushing because of
> a powerpc quirk.

This is a somewhat odd way of putting the "blame" :-) But yeah ok...

> You then added it back as a hacky per-architecture hook (apparently
> having realized that you never did it at all), which didn't fix the
> unintentional lack of flushing on x86.
>
> So now we're going to do it right. No more "oh, powerpc didn't need
> to flush because the hash tables weren't in the tlb at all" thing in
> the generic code that then others need to work around.

So we do need a different flush instruction for the page tables vs. the
normal TLB pages.

Cheers,
Ben.