[PATCH 0/4] [RFC][v4] Workaround for Xeon Phi PTE A/D bits erratum

From: Dave Hansen
Date: Fri Jul 01 2016 - 13:48:01 EST


This is very lightly tested. I haven't even run it on the affected
hardware. Just sending it quickly in case someone can easily see
something fatally wrong with it.

This seems a lot less fragile than the previous patches that relied
on TLB flushing. With those, it seemed like it would be easy to add
new code that hit this issue.

The new approach seems like it'll be harder to break. The most
likely thing to break would be someone looking for a zero pte_val()
and seeing a stray bit. But that seems like a better alternative
than the nastiness that could happen if one of these bits *is*
considered when reading a swap PTE.

--

The Intel(R) Xeon Phi(TM) Processor x200 Family (codename: Knights
Landing) has an erratum where a processor thread setting the Accessed
or Dirty bits may not do so atomically against its checks for the
Present bit. This may cause a thread (which is about to page fault)
to set A and/or D, even though the Present bit had already been
atomically cleared.

If the PTE is used for storing a swap index or a NUMA migration index,
the A bit could be misinterpreted as part of the swap type. The stray
bits being set cause a software-cleared PTE to be interpreted as a
swap entry. In some cases (like when the swap index ends up being
for a non-existent swapfile), the kernel detects the stray value
and WARN()s about it, but there is no guarantee that the kernel can
always detect it.
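
To illustrate the failure mode described above, here is a minimal
user-space sketch. It is not kernel code. The sk_* names are
hypothetical, and the swap layout (a 5-bit type in bits 1..5) is
assumed purely for illustration. Only the bit positions of Present
(bit 0), Accessed (bit 5) and Dirty (bit 6) are the real x86 ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 PTE bit positions: Present is bit 0, Accessed bit 5, Dirty bit 6. */
#define SK_PTE_PRESENT  (UINT64_C(1) << 0)
#define SK_PTE_ACCESSED (UINT64_C(1) << 5)
#define SK_PTE_DIRTY    (UINT64_C(1) << 6)

/* Strict "empty" check: only an all-zero value counts as no entry. */
static bool sk_pte_none_strict(uint64_t pte)
{
	return pte == 0;
}

static bool sk_pte_present(uint64_t pte)
{
	return (pte & SK_PTE_PRESENT) != 0;
}

/* A value that is neither empty nor present gets treated as a swap entry. */
static bool sk_looks_like_swap(uint64_t pte)
{
	return !sk_pte_none_strict(pte) && !sk_pte_present(pte);
}

/* ASSUMPTION: illustrative layout only, 5-bit swap type in bits 1..5. */
static unsigned int sk_swap_type(uint64_t pte)
{
	return (unsigned int)((pte >> 1) & 0x1f);
}
```

With these definitions, a software-cleared PTE of 0 that later picks
up a stray Accessed bit is no longer "none" and is not present, so
sk_looks_like_swap() returns true and sk_swap_type() decodes a
nonzero type from what should have been an empty entry.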

This patch changes the kernel to attempt to ignore those stray bits
when they get set. We do this by making our swap PTE format
completely ignore the A/D bits, and also by ignoring them in our
pte_none() checks.
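
The idea can be sketched in user space as follows. This is only a
rough illustration under stated assumptions, not the code in the
patches: the sk_* helpers are hypothetical, and the swap layout
(payload starting at bit 7, with a 5-bit type below the offset) is
made up so that bits 5 and 6 carry no information.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 PTE bit positions: Accessed is bit 5, Dirty is bit 6. */
#define SK_PTE_ACCESSED (UINT64_C(1) << 5)
#define SK_PTE_DIRTY    (UINT64_C(1) << 6)
#define SK_PTE_AD_MASK  (SK_PTE_ACCESSED | SK_PTE_DIRTY)

/* "Empty" check that ignores stray A/D bits. */
static bool sk_pte_none(uint64_t pte)
{
	return (pte & ~SK_PTE_AD_MASK) == 0;
}

/*
 * ASSUMPTION: hypothetical swap layout. The payload lives in bits 7
 * and up, so bits 0..6 (including Present, A and D) are never read
 * back when decoding.
 */
#define SK_SWP_SHIFT     7
#define SK_SWP_TYPE_BITS 5
#define SK_SWP_TYPE_MASK ((UINT64_C(1) << SK_SWP_TYPE_BITS) - 1)

static uint64_t sk_swap_encode(unsigned int type, uint64_t offset)
{
	uint64_t payload = (offset << SK_SWP_TYPE_BITS) |
			   ((uint64_t)type & SK_SWP_TYPE_MASK);

	return payload << SK_SWP_SHIFT;
}

static unsigned int sk_swap_decode_type(uint64_t pte)
{
	return (unsigned int)((pte >> SK_SWP_SHIFT) & SK_SWP_TYPE_MASK);
}

static uint64_t sk_swap_decode_offset(uint64_t pte)
{
	return pte >> (SK_SWP_SHIFT + SK_SWP_TYPE_BITS);
}
```

In this sketch, OR-ing SK_PTE_AD_MASK into an encoded entry leaves
the decoded type and offset unchanged, and a cleared PTE with stray
A/D bits still satisfies sk_pte_none(). The tradeoff noted earlier
still applies: any caller comparing the raw value against zero would
see the stray bits.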

Andi Kleen wrote the original version of this patch. Dave Hansen
wrote the later ones.

v4: complete rework: let the bad bits stay around, but try to
ignore them
v3: huge rework to keep batching working in unmap case
v2: out of line. avoid single thread flush. cover more clear
cases