[PATCH] x86: Further robustify CR2 handling vs tracing

From: Peter Zijlstra
Date: Thu Mar 06 2014 - 09:53:31 EST


Subject: x86: Further robustify CR2 handling vs tracing
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Wed, 5 Mar 2014 14:07:49 +0100

Building on commit 0ac09f9f8cd1 ("x86, trace: Fix CR2 corruption when
tracing page faults"), this patch addresses a few more issues:

- Now that read_cr2() is lifted into trace_do_page_fault(), we should
  pass the address to trace_page_fault_entries() so that it does not
  re-read a potentially changed CR2.

- Put both trace_do_page_fault() and trace_page_fault_entries() under
  CONFIG_TRACING.

- Mark both fault entry functions {,trace_}do_page_fault() as notrace
  to avoid getting __mcount or other function entry trace callbacks
  before we've observed CR2.

- Mark __do_page_fault() as noinline to guarantee that the function
  tracer still gets to see the fault now that both callers are notrace.

Cc: jolsa@xxxxxxxxxx
Cc: vincent.weaver@xxxxxxxxx
Cc: tglx@xxxxxxxxxxxxx
Cc: hpa@xxxxxxxxxxxxxxx
Cc: mingo@xxxxxxxxxx
Acked-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
---
arch/x86/mm/fault.c | 33 +++++++++++++++++++++++----------
1 file changed, 23 insertions(+), 10 deletions(-)
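
Not part of the patch: a minimal user-space sketch of the ordering
problem described above, assuming a simulated CR2 and a tracing hook
that takes a nested fault. Every name in it (fake_cr2, fake_trace_hook,
handler_read_late, handler_read_early, cr2_ordering_selfcheck) is
hypothetical and exists only for illustration; none of them are kernel
interfaces.

```c
#include <assert.h>

/* Stand-in for the CR2 register: address of the most recent fault. */
static unsigned long fake_cr2;

/*
 * Hypothetical tracing hook that itself takes a fault (for example
 * while walking a user-space callchain) and so overwrites the
 * simulated CR2, as a nested page fault would.
 */
void fake_trace_hook(void)
{
	fake_cr2 = 0xdeadbeefUL;
}

unsigned long fake_read_cr2(void)
{
	return fake_cr2;
}

/* Problematic ordering: tracing runs before the address is latched. */
unsigned long handler_read_late(unsigned long fault_addr)
{
	fake_cr2 = fault_addr;		/* "hardware" records the fault */
	fake_trace_hook();		/* tracing clobbers CR2 */
	return fake_read_cr2();		/* too late: nested fault's address */
}

/* Intended ordering: latch the address first, then pass the copy on. */
unsigned long handler_read_early(unsigned long fault_addr)
{
	unsigned long address;

	fake_cr2 = fault_addr;		/* "hardware" records the fault */
	address = fake_read_cr2();	/* observe CR2 before anything else */
	fake_trace_hook();		/* may clobber CR2; the copy is safe */
	return address;
}

/* Small self-check contrasting the two orderings. */
void cr2_ordering_selfcheck(void)
{
	assert(handler_read_late(0x1000UL) != 0x1000UL);
	assert(handler_read_early(0x1000UL) == 0x1000UL);
}
```

In this sketch the late read returns the address left behind by the
nested fault, while the early read keeps the original one. That is the
reason for reading CR2 first in a notrace entry function and handing
the saved value to everything called afterwards.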

--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1020,8 +1020,12 @@ static inline bool smap_violation(int er
* This routine handles page faults. It determines the address,
* and the problem, and then passes it off to one of the appropriate
* routines.
+ *
+ * This function must be noinline because both callers,
+ * {,trace_}do_page_fault(), are notrace. Keeping this an actual function
+ * guarantees there's a function trace entry.
*/
-static void __kprobes
+static void __kprobes noinline
__do_page_fault(struct pt_regs *regs, unsigned long error_code,
unsigned long address)
{
@@ -1245,31 +1249,38 @@ __do_page_fault(struct pt_regs *regs, un
up_read(&mm->mmap_sem);
}

-dotraplinkage void __kprobes
+dotraplinkage void __kprobes notrace
do_page_fault(struct pt_regs *regs, unsigned long error_code)
{
+ unsigned long address = read_cr2(); /* Get the faulting address */
enum ctx_state prev_state;
- /* Get the faulting address: */
- unsigned long address = read_cr2();
+
+ /*
+ * This function must be tagged __kprobes and notrace, and must call
+ * read_cr2() before anything else, so that no tracing machinery can
+ * run before we've observed the CR2 value.
+ *
+ * exception_{enter,exit}() contain all sorts of tracepoints.
+ */

prev_state = exception_enter();
__do_page_fault(regs, error_code, address);
exception_exit(prev_state);
}

-static void trace_page_fault_entries(struct pt_regs *regs,
+#ifdef CONFIG_TRACING
+static void trace_page_fault_entries(unsigned long address, struct pt_regs *regs,
unsigned long error_code)
{
if (user_mode(regs))
- trace_page_fault_user(read_cr2(), regs, error_code);
+ trace_page_fault_user(address, regs, error_code);
else
- trace_page_fault_kernel(read_cr2(), regs, error_code);
+ trace_page_fault_kernel(address, regs, error_code);
}

-dotraplinkage void __kprobes
+dotraplinkage void __kprobes notrace
trace_do_page_fault(struct pt_regs *regs, unsigned long error_code)
{
- enum ctx_state prev_state;
/*
* The exception_enter and tracepoint processing could
* trigger another page faults (user space callchain
@@ -1277,9 +1288,11 @@ trace_do_page_fault(struct pt_regs *regs
* the faulting address now.
*/
unsigned long address = read_cr2();
+ enum ctx_state prev_state;

prev_state = exception_enter();
- trace_page_fault_entries(regs, error_code);
+ trace_page_fault_entries(address, regs, error_code);
__do_page_fault(regs, error_code, address);
exception_exit(prev_state);
}
+#endif /* CONFIG_TRACING */