Re: [PATCH v2] kprobes/x86: Use 5-byte NOP when the code might be modified by ftrace

From: Petr Mladek
Date: Tue Feb 03 2015 - 07:00:20 EST


On Tue 2015-02-03 12:38:28, Petr Mladek wrote:
> On Tue 2015-02-03 16:41:39, Masami Hiramatsu wrote:
> > (2015/02/03 2:48), Petr Mladek wrote:
> > > can_probe() checks if the given address points to the beginning of
> > > an instruction. It analyzes all the instructions from the beginning
> > > of the function until the given address. The code might be modified
> > > by another Kprobe. In this case, the current code is read into a buffer,
> > > int3 breakpoint is replaced by the saved opcode in the buffer, and
> > > can_probe() analyzes the buffer instead.
> > >
> > > There is a bug that __recover_probed_insn() tries to restore
> > > the original code even for Kprobes using the ftrace framework.
> > > But in this case, the opcode is not stored. See the difference
> > > between arch_prepare_kprobe() and arch_prepare_kprobe_ftrace().
> > > The opcode is stored by arch_copy_kprobe() only from
> > > arch_prepare_kprobe().
> > >
> > > This patch makes Kprobe to use the ideal 5-byte NOP when the code
> > > can be modified by ftrace. It is the original instruction, see
> > > ftrace_make_nop() and ftrace_nop_replace().
> > >
> > > Note that we always need to use the NOP for ftrace locations. Kprobes
> > > do not block ftrace and the instruction might get modified at anytime.
> > > It might even be in an inconsistent state because it is modified step
> > > by step using the int3 breakpoint.
> > >
> > > The patch also fixes indentation of the touched comment.
> > >
> > > Note that I found this problem when playing with Kprobes. I did it
> > > on x86_64 with gcc-4.8.3 that supported -mfentry. I modified
> > > samples/kprobes/kprobe_example.c and added offset 5 to put
> > > the probe right after the fentry area:
> > >
> > > --- cut ---
> > > static struct kprobe kp = {
> > > .symbol_name = "do_fork",
> > > + .offset = 5,
> > > };
> > > --- cut ---
> > >
> > > Then I was able to load kprobe_example before jprobe_example
> > > but not the other way around:
> > >
> > > $> modprobe jprobe_example
> > > $> modprobe kprobe_example
> > > modprobe: ERROR: could not insert 'kprobe_example': Invalid or incomplete multibyte or wide character
> > >
> > > It did not make much sense and debugging pointed to the bug
> > > described above.
> > >
> >
> > This looks good to me :)
> >
> > Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@xxxxxxxxxxx>
> >
> > Ingo, could you merge this as an urgent fix?
>
> Please, wait a bit, see below.
>
> > Thank you!
> >
> > > Signed-off-by: Petr Mladek <pmladek@xxxxxxx>
> > > ---
> > > arch/x86/kernel/kprobes/core.c | 42 ++++++++++++++++++++++++++++--------------
> > > 1 file changed, 28 insertions(+), 14 deletions(-)
> > >
> > > Changes against v1:
> > >
> > > + always use 5-byte NOP for ftrace location
> > > + fix indentation of the touched comment
> > >
> > > diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> > > index 98f654d466e5..2f464b56766a 100644
> > > --- a/arch/x86/kernel/kprobes/core.c
> > > +++ b/arch/x86/kernel/kprobes/core.c
> > > @@ -223,27 +223,41 @@ static unsigned long
> > > __recover_probed_insn(kprobe_opcode_t *buf, unsigned long addr)
> > > {
> > > struct kprobe *kp;
> > > + unsigned long faddr;
> > >
> > > kp = get_kprobe((void *)addr);
> > > - /* There is no probe, return original address */
> > > - if (!kp)
> > > + faddr = ftrace_location(addr);
>
> I have just realized that ftrace_location() might return another
> address if the given one points inside the ftrace_location.
> This situation is not checked by this patch. I am going to work
> on v3.

Well, it should not happen after all, because __recover_probed_insn() is called
only for already approved Kprobe locations and therefore only for the first
byte of an ftrace location. Any address inside an ftrace location is
refused earlier by check_kprobe_address_safe(), which is called from
register_kprobe().

It means that ftrace_location() will never return a different address here,
so the patch can be used as is, unless you want to be paranoid.
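
To illustrate the argument, here is a small user-space model, not kernel
code. All model_* names, the site table, and the addresses are made up for
this sketch. It assumes that ftrace_location() returns the start of the
ftrace site for any address inside the site, and that the registration check
refuses an address whose ftrace location differs from the address itself.
The -EILSEQ value is an assumption chosen because it matches the modprobe
error text quoted above.

```c
#include <stddef.h>
#include <errno.h>

/* Size of one ftrace call site in this model (5 bytes on x86). */
#define MODEL_MCOUNT_INSN_SIZE 5UL

/* Hypothetical start addresses of ftrace call sites. */
static const unsigned long model_ftrace_sites[] = { 0x1000UL, 0x2000UL };

/*
 * Model of ftrace_location(): return the start of the ftrace site that
 * covers addr, or 0 when addr is not inside any site.
 */
static unsigned long model_ftrace_location(unsigned long addr)
{
	size_t n = sizeof(model_ftrace_sites) / sizeof(model_ftrace_sites[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long start = model_ftrace_sites[i];

		if (addr >= start && addr < start + MODEL_MCOUNT_INSN_SIZE)
			return start;
	}
	return 0;
}

/*
 * Model of the registration-time check: an address inside an ftrace
 * site, but not at its first byte, is refused.
 */
static int model_check_kprobe_address(unsigned long addr)
{
	unsigned long faddr = model_ftrace_location(addr);

	if (faddr && faddr != addr)
		return -EILSEQ;
	return 0;
}

/*
 * The worrying case: the lookup yields an address other than the one
 * asked about. Returns 1 when that happens, 0 otherwise.
 */
static int model_lookup_returns_other_address(unsigned long addr)
{
	unsigned long faddr = model_ftrace_location(addr);

	return faddr != 0 && faddr != addr;
}
```

In this model, every address that passes model_check_kprobe_address() makes
model_lookup_returns_other_address() return 0, which is the property the
patch relies on.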

I am sorry for the noise.

Best Regards,
Petr

> > > + /*
> > > + * Use the current code if it is not modified by Kprobe
> > > + * and it cannot be modified by ftrace.
> > > + */
> > > + if (!kp && !faddr)
> > > return addr;
> > >
> > > /*
> > > - * Basically, kp->ainsn.insn has an original instruction.
> > > - * However, RIP-relative instruction can not do single-stepping
> > > - * at different place, __copy_instruction() tweaks the displacement of
> > > - * that instruction. In that case, we can't recover the instruction
> > > - * from the kp->ainsn.insn.
> > > + * Basically, kp->ainsn.insn has an original instruction.
> > > + * However, RIP-relative instruction can not do single-stepping
> > > + * at different place, __copy_instruction() tweaks the displacement of
> > > + * that instruction. In that case, we can't recover the instruction
> > > + * from the kp->ainsn.insn.
> > > *
> > > - * On the other hand, kp->opcode has a copy of the first byte of
> > > - * the probed instruction, which is overwritten by int3. And
> > > - * the instruction at kp->addr is not modified by kprobes except
> > > - * for the first byte, we can recover the original instruction
> > > - * from it and kp->opcode.
> > > + * On the other hand, in case on normal Kprobe, kp->opcode has a copy
> > > + * of the first byte of the probed instruction, which is overwritten
> > > + * by int3. And the instruction at kp->addr is not modified by kprobes
> > > + * except for the first byte, we can recover the original instruction
> > > + * from it and kp->opcode.
> > > + *
> > > + * In case of Kprobes using ftrace, we do not have a copy of
> > > + * the original instruction. In fact, the ftrace location might
> > > + * be modified at anytime and even could be in an inconsistent state.
> > > + * Fortunately, we know that the original code is the ideal 5-byte
> > > + * long NOP.
> > > */
> > > - memcpy(buf, kp->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
> > > - buf[0] = kp->opcode;
> > > + memcpy(buf, (void *)addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
> > > + if (faddr)
> > > + memcpy(buf, ideal_nops[NOP_ATOMIC5], MCOUNT_INSN_SIZE);
> > > + else
> > > + buf[0] = kp->opcode;
> > > return (unsigned long)buf;
> > > }
> > >
> > >
> >
> >
> > --
> > Masami HIRAMATSU
> > Software Platform Research Dept. Linux Technology Research Center
> > Hitachi, Ltd., Yokohama Research Laboratory
> > E-mail: masami.hiramatsu.pt@xxxxxxxxxxx
> >
> >