Re: kprobes: propagate error from arm_kprobe_ftrace()

From: Jessica Yu
Date: Sat Oct 07 2017 - 06:52:43 EST


+++ Masami Hiramatsu [05/10/17 06:23 +0000]:
Hi Jessica,

On Wed, 4 Oct 2017 21:14:13 +0200
Jessica Yu <jeyu@xxxxxxxxxx> wrote:

Improve error handling when arming ftrace-based kprobes. Specifically, if
we fail to arm a ftrace-based kprobe, register_kprobe()/enable_kprobe()
should report an error instead of success. Previously, this has led to
confusing situations where register_kprobe() would return 0 indicating
success, but the kprobe would not be functional if ftrace registration
during the kprobe arming process had failed. We should therefore take any
errors returned by ftrace into account and propagate this error so that we
do not register/enable kprobes that cannot be armed. This can happen if,
for example, register_ftrace_function() finds an IPMODIFY conflict (since
kprobe_ftrace_ops has this flag set) and returns an error. Such a conflict
is possible since livepatches also set the IPMODIFY flag for their ftrace_ops.

arm_all_kprobes() keeps its current behavior and attempts to arm all
kprobes. It returns the last encountered error and gives a warning if
not all kprobes could be armed.

This patch is based on Petr Mladek's original patchset (patches 2 and 3)
back in 2015, which improved kprobes error handling, found here:

https://lkml.org/lkml/2015/2/26/452

However, further work on this stalled since then and the patches were
not upstreamed.

Ok, I have some comments. See below.


Based-on-patches-by: Petr Mladek <pmladek@xxxxxxxx>
Signed-off-by: Jessica Yu <jeyu@xxxxxxxxxx>
---
kernel/kprobes.c | 87 +++++++++++++++++++++++++++++++++++++++-----------------
1 file changed, 61 insertions(+), 26 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 2d28377a0e32..6e889be0d93c 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -979,18 +979,27 @@ static int prepare_kprobe(struct kprobe *p)
}

/* Caller must lock kprobe_mutex */
-static void arm_kprobe_ftrace(struct kprobe *p)
+static int arm_kprobe_ftrace(struct kprobe *p)
{
- int ret;
+ int ret = 0;

ret = ftrace_set_filter_ip(&kprobe_ftrace_ops,
(unsigned long)p->addr, 0, 0);
- WARN(ret < 0, "Failed to arm kprobe-ftrace at %p (%d)\n", p->addr, ret);
- kprobe_ftrace_enabled++;
- if (kprobe_ftrace_enabled == 1) {
+ if (WARN(ret < 0, "Failed to arm kprobe-ftrace at %p (%d)\n", p->addr, ret))
+ return ret;
+
+ if (kprobe_ftrace_enabled == 0) {
ret = register_ftrace_function(&kprobe_ftrace_ops);
- WARN(ret < 0, "Failed to init kprobe-ftrace (%d)\n", ret);
+ if (WARN(ret < 0, "Failed to init kprobe-ftrace (%d)\n", ret))
+ goto err_ftrace;
}
+
+ kprobe_ftrace_enabled++;
+ return ret;
+
+err_ftrace:
+ ftrace_set_filter_ip(&kprobe_ftrace_ops, (unsigned long)p->addr, 1, 0);
+ return ret;
}

/* Caller must lock kprobe_mutex */
@@ -1009,22 +1018,23 @@ static void disarm_kprobe_ftrace(struct kprobe *p)
}
#else /* !CONFIG_KPROBES_ON_FTRACE */
#define prepare_kprobe(p) arch_prepare_kprobe(p)
-#define arm_kprobe_ftrace(p) do {} while (0)
+#define arm_kprobe_ftrace(p) (0)
#define disarm_kprobe_ftrace(p) do {} while (0)
#endif

/* Arm a kprobe with text_mutex */
-static void arm_kprobe(struct kprobe *kp)
+static int arm_kprobe(struct kprobe *kp)
{
- if (unlikely(kprobe_ftrace(kp))) {
- arm_kprobe_ftrace(kp);
- return;
- }
+ if (unlikely(kprobe_ftrace(kp)))
+ return arm_kprobe_ftrace(kp);
+
cpus_read_lock();
mutex_lock(&text_mutex);
__arm_kprobe(kp);
mutex_unlock(&text_mutex);
cpus_read_unlock();
+
+ return 0;
}

/* Disarm a kprobe with text_mutex */
@@ -1363,9 +1373,14 @@ static int register_aggr_kprobe(struct kprobe *orig_p, struct kprobe *p)

if (ret == 0 && kprobe_disabled(ap) && !kprobe_disabled(p)) {
ap->flags &= ~KPROBE_FLAG_DISABLED;
- if (!kprobes_all_disarmed)
+ if (!kprobes_all_disarmed) {
/* Arm the breakpoint again. */
- arm_kprobe(ap);
+ ret = arm_kprobe(ap);
+ if (ret) {
+ ap->flags |= KPROBE_FLAG_DISABLED;
+ list_del_rcu(&p->list);

Nice catch :) This list_del_rcu() is important to keep the error-case
behavior sane.

+ }
+ }
}
return ret;
}
@@ -1570,13 +1585,16 @@ int register_kprobe(struct kprobe *p)
if (ret)
goto out;

+ if (!kprobes_all_disarmed && !kprobe_disabled(p)) {
+ ret = arm_kprobe(p);
+ if (ret)
+ goto out;
+ }
+

No, this is no good. There is a small chance of hitting the kprobe on
other CPUs before it is added to the kprobe_table hashlist. In that
case, we will see a stray breakpoint instruction.

Ah yes, you are right, this is incorrect. There is a short window of
time where we could have a stray breakpoint from an armed kprobe, but
the breakpoint handler would not be able to find the associated kprobe
in the hashlist. Will fix this in v2.

INIT_HLIST_NODE(&p->hlist);
hlist_add_head_rcu(&p->hlist,
&kprobe_table[hash_ptr(p->addr, KPROBE_HASH_BITS)]);

- if (!kprobes_all_disarmed && !kprobe_disabled(p))
- arm_kprobe(p);
-

So, you'll have to roll back with hlist_del_rcu() here.
Hmm, by the way, in this case you also have to add a synchronize_rcu()
at the end of the error path, so that the user can release the kprobe
right after register_kprobe() returns an error... (I think that's OK
because it is not a hot path)

Yes, I'll fix this in the error path as well. Thank you for your
comments! Will send a v2.

Jessica