[tip:x86/mm] bpf: Fail bpf_probe_write_user() while mm is switched

From: tip-bot for Nadav Amit
Date: Tue Apr 30 2019 - 07:17:20 EST


Commit-ID: c7b6f29b6257532792fc722b68fcc0e00b5a856c
Gitweb: https://git.kernel.org/tip/c7b6f29b6257532792fc722b68fcc0e00b5a856c
Author: Nadav Amit <namit@xxxxxxxxxx>
AuthorDate: Thu, 25 Apr 2019 17:11:43 -0700
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Tue, 30 Apr 2019 12:37:48 +0200

bpf: Fail bpf_probe_write_user() while mm is switched

While a temporary mm is in use, bpf_probe_write_user() must not write to
user memory, because user-space addresses may then map kernel memory.
Detect this state and fail bpf_probe_write_user() with -EPERM.

Suggested-by: Jann Horn <jannh@xxxxxxxxxx>
Reported-by: Jann Horn <jannh@xxxxxxxxxx>
Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: <ard.biesheuvel@xxxxxxxxxx>
Cc: <deneen.t.dock@xxxxxxxxx>
Cc: <kernel-hardening@xxxxxxxxxxxxxxxxxx>
Cc: <kristen@xxxxxxxxxxxxxxx>
Cc: <linux_dti@xxxxxxxxxx>
Cc: <will.deacon@xxxxxxx>
Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: H. Peter Anvin <hpa@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Link: https://lkml.kernel.org/r/20190426001143.4983-24-namit@xxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/trace/bpf_trace.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index d64c00afceb5..94b0e37d90ef 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -14,6 +14,8 @@
 #include <linux/syscalls.h>
 #include <linux/error-injection.h>

+#include <asm/tlb.h>
+
 #include "trace_probe.h"
 #include "trace.h"

@@ -163,6 +165,10 @@ BPF_CALL_3(bpf_probe_write_user, void *, unsafe_ptr, const void *, src,
 	 * access_ok() should prevent writing to non-user memory, but in
 	 * some situations (nommu, temporary switch, etc) access_ok() does
 	 * not provide enough validation, hence the check on KERNEL_DS.
+	 *
+	 * nmi_uaccess_okay() ensures the probe is not run in an interim
+	 * state, when the task or mm are switched. This is specifically
+	 * required to prevent the use of temporary mm.
 	 */

 	if (unlikely(in_interrupt() ||
@@ -170,6 +176,8 @@ BPF_CALL_3(bpf_probe_write_user, void *, unsafe_ptr, const void *, src,
 		return -EPERM;
 	if (unlikely(uaccess_kernel()))
 		return -EPERM;
+	if (unlikely(!nmi_uaccess_okay()))
+		return -EPERM;
 	if (!access_ok(unsafe_ptr, size))
 		return -EPERM;
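
The order of the guards in the hunk above can be modelled in user space. The following is only a sketch, not kernel code: struct probe_ctx, model_nmi_uaccess_okay() and model_probe_write_user() are hypothetical names, each boolean stands in for the kernel predicate named in its comment, and the part of the first condition that the diff elides is deliberately left out. It shows why a range check alone is not enough: an address can pass the access_ok()-style test and the write must still be refused while the mm is switched.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state the helper inspects (not a kernel type). */
struct probe_ctx {
	bool in_interrupt;       /* stands in for in_interrupt() */
	bool kernel_ds;          /* stands in for uaccess_kernel() */
	bool mm_switched;        /* assumed: true while a temporary mm is loaded */
	bool addr_in_user_range; /* stands in for access_ok(unsafe_ptr, size) */
};

/* Simplified stand-in for nmi_uaccess_okay(): refuse while the mm is switched. */
static bool model_nmi_uaccess_okay(const struct probe_ctx *c)
{
	return !c->mm_switched;
}

/*
 * Mirrors the order of the checks visible in the diff. Returns -EPERM when
 * any guard fails and 0 when the write would be attempted (the real helper
 * returns the result of the write itself at that point).
 */
static int model_probe_write_user(const struct probe_ctx *c)
{
	if (c->in_interrupt)
		return -EPERM;
	if (c->kernel_ds)
		return -EPERM;
	if (!model_nmi_uaccess_okay(c))
		return -EPERM;
	if (!c->addr_in_user_range)
		return -EPERM;
	return 0;
}
```

In this model, a context with mm_switched set is rejected even when addr_in_user_range is true, which is the case the new check exists to cover.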