[RFC][PATCH 8/8] x86/mm: remove spurious fault pkey check

From: Dave Hansen
Date: Fri Sep 07 2018 - 15:51:59 EST



From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>

Spurious faults only ever occur in the kernel's address space. They
are also constrained specifically to faults with one of these error codes:

X86_PF_WRITE | X86_PF_PROT
X86_PF_INSTR | X86_PF_PROT

So, it's never even possible to reach spurious_kernel_fault_check() with
X86_PF_PK set.

In addition, the kernel's address space never has pages with user-mode
protections. Protection Keys are only enforced on pages with user-mode
protection.

This gives us lots of reasons to not check for protection keys in our
spurious kernel fault handling.

But, let's also add some warnings to ensure that these assumptions about
protection keys hold true.

Signed-off-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
Cc: "Peter Zijlstra (Intel)" <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: x86@xxxxxxxxxx
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
---
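Note for reviewers (not part of the changelog): the error-code argument
above can be checked with a small user-space model. This is only a
sketch. The X86_PF_* values mirror the kernel's definitions, but
spurious_error_code_ok() is a hypothetical name used for illustration,
and it models just the error-code gate described in the changelog, not
the page table walk.

```c
#include <stdbool.h>

/* x86 page fault error code bits, mirroring the kernel's X86_PF_* values. */
#define X86_PF_PROT	(1UL << 0)	/* protection violation, not a missing page */
#define X86_PF_WRITE	(1UL << 1)	/* access was a write */
#define X86_PF_USER	(1UL << 2)	/* access came from user mode */
#define X86_PF_RSVD	(1UL << 3)	/* reserved bit set in a paging entry */
#define X86_PF_INSTR	(1UL << 4)	/* access was an instruction fetch */
#define X86_PF_PK	(1UL << 5)	/* protection keys violation */

/*
 * Hypothetical helper (assumption, not a kernel function): returns true
 * only for the two exact error codes that the changelog says a spurious
 * fault may carry.  The comparison is for equality, so any extra bit,
 * including X86_PF_PK, makes it return false.
 */
static bool spurious_error_code_ok(unsigned long error_code)
{
	return error_code == (X86_PF_WRITE | X86_PF_PROT) ||
	       error_code == (X86_PF_INSTR | X86_PF_PROT);
}
```

Because the gate compares for equality rather than testing individual
bits, no error code with X86_PF_PK set can pass it, which is why the
removed check was unreachable.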

b/arch/x86/mm/fault.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff -puN arch/x86/mm/fault.c~pkeys-fault-warnings arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c~pkeys-fault-warnings 2018-09-07 12:32:23.190741335 -0700
+++ b/arch/x86/mm/fault.c 2018-09-07 12:32:23.194741335 -0700
@@ -1037,12 +1037,6 @@ static int spurious_kernel_fault_check(u

 	if ((error_code & X86_PF_INSTR) && !pte_exec(*pte))
 		return 0;
-	/*
-	 * Note: We do not do lazy flushing on protection key
-	 * changes, so no spurious fault will ever set X86_PF_PK.
-	 */
-	if ((error_code & X86_PF_PK))
-		return 1;

 	return 1;
 }
@@ -1213,6 +1207,13 @@ do_kern_addr_space_fault(struct pt_regs
unsigned long address)
 {
 	/*
+	 * Protection keys exceptions only happen on user pages.  We
+	 * have no user pages in the kernel portion of the address
+	 * space, so do not expect them here.
+	 */
+	WARN_ON_ONCE(hw_error_code & X86_PF_PK);
+
+	/*
 	 * We can fault-in kernel-space virtual memory on-demand.  The
 	 * 'reference' page table is init_mm.pgd.
 	 *
_