Re: [RFC 13/13] x86/mm: Try to preserve old TLB entries using PCID

From: Andy Lutomirski
Date: Fri Jan 08 2016 - 21:20:24 EST


On Fri, Jan 8, 2016 at 4:27 PM, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> wrote:
> On 01/08/2016 03:15 PM, Andy Lutomirski wrote:
>> + * The guiding principle of this code is that TLB entries that have
>> + * survived more than a small number of context switches are mostly
>> + * useless, so we don't try very hard not to evict them.
>
> Big ack on that. The original approach tried to keep track of the full
> 4k worth of possible PCIDs, it also needed an additional cpumask (which
> it dynamically allocated) for where the PCID was active in addition to
> the normal "where has this mm been" mask.

My patch has a similar extra cpumask, but at least I didn't
dynamically allocate it. I did it because I need a 100% reliable way
to tell whether a given mm has a valid PCID in a cpu's PCID LRU list,
as opposed to just matching due to struct mm reuse or similar. I also
need the ability to blow away old mappings, which I can do by clearing
the cpumask. This happens in init_new_context and in
propagate_tlb_flush.
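
To make the cpumask check concrete, here is a minimal user-space sketch
of the idea, not the code from the patch. Every name in it (struct
mm_sketch, pcid_valid_mask, cpu_pcid_slots, find_valid_pcid and so on)
is hypothetical, and a plain 64-bit word stands in for a real cpumask.
A PCID counts as valid only if the per-cpu slot points at the mm *and*
the mm's bit for that cpu is set, so a pointer match caused by struct mm
reuse is never trusted on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical, tiny on purpose */
#define NR_PCID_SLOTS  4   /* hypothetical size of the per-cpu PCID list */

struct mm_sketch {
	/* bit n set: cpu n may still hold usable PCID-tagged entries for this mm */
	uint64_t pcid_valid_mask;
};

/* Per-cpu list: which mm last owned each PCID slot on that cpu. */
static struct mm_sketch *cpu_pcid_slots[NR_CPUS_SKETCH][NR_PCID_SLOTS];

/* Return the slot holding a valid PCID for mm on cpu, or -1 if there is none. */
static int find_valid_pcid(unsigned int cpu, const struct mm_sketch *mm)
{
	assert(cpu < NR_CPUS_SKETCH);

	/* Without the mask bit, a matching pointer may be a reused struct. */
	if (!(mm->pcid_valid_mask & (UINT64_C(1) << cpu)))
		return -1;

	for (int i = 0; i < NR_PCID_SLOTS; i++)
		if (cpu_pcid_slots[cpu][i] == mm)
			return i;
	return -1;
}

/* Record that mm now owns the given slot on cpu (caller flushes that PCID). */
static void assign_pcid(unsigned int cpu, int slot, struct mm_sketch *mm)
{
	assert(cpu < NR_CPUS_SKETCH && slot >= 0 && slot < NR_PCID_SLOTS);
	cpu_pcid_slots[cpu][slot] = mm;
	mm->pcid_valid_mask |= UINT64_C(1) << cpu;
}

/* Blow away old mappings everywhere by clearing the mask. */
static void zap_all_pcids(struct mm_sketch *mm)
{
	mm->pcid_valid_mask = 0;
}
```

In this sketch, clearing the mask is enough to invalidate every cpu's
cached PCID for the mm without touching the per-cpu lists themselves.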

The other way to do it would be to store some kind of generation
counter in the per-cpu list. I could use a global 64-bit atomic
counter to allocate never-reused mm ids (it's highly unlikely that a
system will run long enough for such a counter to overflow -- it could
be incremented at most once every few ns, giving hundreds of years of
safety), but that's kind of expensive. I could use a per-cpu
allocator, but 54 bits per cpu is uncomfortably small unless we have
wraparound handling. We could do 64 bits per cpu for very cheap
counter allocation, but then the "zap the pcid" logic gets much
nastier in that neither the percpu entries nor the per-mm generation
counter entries fit in a word any more. Maybe that's fine.
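
For illustration only, the global-counter variant could look roughly
like the user-space sketch below. This is an assumption about how such a
scheme might be structured, not code from the patch; next_mm_id,
alloc_mm_id, struct cpu_pcid_list_sketch and years_until_wrap are all
made-up names. Each mm gets an id that is never handed out twice, the
per-cpu list stores ids rather than pointers, and zapping an mm amounts
to giving it a fresh id so that no old entry can match.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_SLOTS_SKETCH 4  /* hypothetical size of the per-cpu PCID list */

/* Global allocator for never-reused mm ids; 0 is reserved for "empty slot". */
static _Atomic uint64_t next_mm_id = 1;

static uint64_t alloc_mm_id(void)
{
	/* One contended atomic op per allocation: this is the expensive part. */
	return atomic_fetch_add(&next_mm_id, 1);
}

struct cpu_pcid_list_sketch {
	uint64_t owner_id[NR_SLOTS_SKETCH]; /* mm id that owns each PCID slot */
};

/* Return the slot owned by mm id on this cpu, or -1 if there is none. */
static int find_pcid_by_id(const struct cpu_pcid_list_sketch *list, uint64_t id)
{
	for (int i = 0; i < NR_SLOTS_SKETCH; i++)
		if (list->owner_id[i] == id)
			return i;
	return -1;
}

/* Years until a 64-bit counter wraps at one increment per ns_per_alloc ns. */
static double years_until_wrap(double ns_per_alloc)
{
	const double ns_per_year = 365.25 * 24.0 * 3600.0 * 1e9;
	const double two_pow_64 = 18446744073709551616.0;

	return two_pow_64 * ns_per_alloc / ns_per_year;
}
```

Even at one allocation per nanosecond, years_until_wrap() comes out at
several hundred years, which is where the "hundreds of years" estimate
comes from.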

What we can't do easily is have a per-mm generation counter, because
freeing an mm and reallocating it in the same place needs to reliably
zap the pcid on all CPUs.

Anyway, this problem is clearly solvable, but I haven't thought of a
straightforward solution that doesn't involve rarely-executed code
paths, and that makes me a bit nervous.

--Andy