Re: [PATCH 2/5] uprobes: introduce uprobe_switch_to()

From: Peter Zijlstra
Date: Wed Nov 30 2011 - 07:12:36 EST


On Tue, 2011-11-29 at 18:18 +0100, Oleg Nesterov wrote:
> On 11/28, Peter Zijlstra wrote:
> >
> > On Mon, 2011-11-28 at 20:06 +0100, Oleg Nesterov wrote:
> > > +void uprobe_switch_to(struct task_struct *curr)
> > > +{
> > > + struct uprobe_task *utask = curr->utask;
> > > + struct pt_regs *regs = task_pt_regs(curr);
> > > +
> > > + if (!utask || utask->state != UTASK_SSTEP)
> > > + return;
> > > +
> > > + if (!(regs->flags & X86_EFLAGS_TF))
> > > + return;
> > > +
> > > + set_xol_ip(regs);
> > > +}
> >
> > > void __weak set_xol_ip(struct pt_regs *regs)
> > > {
> > > + int cpu = smp_processor_id();
> > > + struct uprobe_task *utask = current->utask;
> > > + struct uprobe *uprobe = utask->active_uprobe;
> > > +
> > > + memcpy(uprobe_xol_slots[cpu], uprobe->insn, MAX_UINSN_BYTES);
> > > +
> > > + utask->xol_vaddr = fix_to_virt(UPROBE_XOL_FIRST_PAGE)
> > > + + UPROBES_XOL_SLOT_BYTES * cpu;
> > > + set_instruction_pointer(regs, utask->xol_vaddr);
> > > }
> >
> > So uprobe_switch_to() will always reset the IP to the start of the slot?
> > That sounds wrong, things like the RIP relative stuff needs multiple
> > instructions.
>
> Hmm. Could you explain? Especially the "multiple instructions" part.
>
> In any case we should reset the IP to the start of the slot.
>
> But yes, I'm afraid this is too simple. Before this patches pre_xol()
> is called when we already know ->xol_vaddr. But afaics x86 doesn't use
> this info (post_xol() does). So this looks equally correct or wrong.
>
> But perhaps we need another arch-dependent hook which takes ->xol_vaddr
> into account instead of simple memcpy(), to handle the RIP relative
> case.
>
> Or I misunderstood?

Suppose you need multiple instructions to replace the one you patched
out, for example because the instruction was RIP relative (its effect
depends on the address the instruction sits at, e.g. short jumps as
opposed to absolute jumps).

One way to translate such an instruction is something like:

	push eax
	mov eax, $previous_ip
	$ins eax+offset
	pop eax

Also, the thing Srikar mentioned is boosted probes; in that case you
forgo the whole single-step thing and rewrite the probe as:

	$ins
	jmp $next_insn

Now in the former case you still single-step, so the context switch hook
can function as proposed (triggered off of TIF_SINGLESTEP). However, if
you get preempted after the mov you want to continue with the $ins, not
restart at the push. So uprobe_switch_to() will have to preserve the
relative offset within the slot.
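
To make "preserve the relative offset" concrete, here is a minimal
user-space sketch of the address arithmetic only. It is not code from
the patch: SLOT_BYTES, NR_SLOTS, slot_base() and relocate_ip() are
hypothetical names, and the sketch assumes per-CPU slots laid out
back to back starting at some first_slot address.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_BYTES 128u	/* hypothetical slot size */
#define NR_SLOTS   64u	/* hypothetical number of per-CPU slots */

/* Hypothetical helper: base address of the slot owned by @cpu. */
static uintptr_t slot_base(uintptr_t first_slot, unsigned int cpu)
{
	assert(cpu < NR_SLOTS);
	return first_slot + (uintptr_t)SLOT_BYTES * cpu;
}

/*
 * Map an IP inside @old_cpu's slot to the same offset inside
 * @new_cpu's slot. An IP that is not inside the old slot is
 * returned unchanged.
 */
static uintptr_t relocate_ip(uintptr_t ip, uintptr_t first_slot,
			     unsigned int old_cpu, unsigned int new_cpu)
{
	uintptr_t old_base = slot_base(first_slot, old_cpu);

	if (ip < old_base || ip - old_base >= SLOT_BYTES)
		return ip;

	/* Keep the offset, so we resume at $ins rather than at push. */
	return slot_base(first_slot, new_cpu) + (ip - old_base);
}
```

The point is only that the hook adds (ip - old_base) to the new slot
base instead of unconditionally setting the IP to the start of the
slot.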

In the second example there's no single-stepping left, so we need to
create a new TIF flag: set it when you first set up the probe, and clear
it on the first context switch where the IP is outside of the slot. You
still need to maintain the relative offset within the slot when you
move it around.
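
A rough user-space model of that flag's lifecycle, again only as a
sketch: struct task_model, its in_xol_slot field (standing in for the
new TIF flag), model_arm() and model_switch_to() are all made-up
names, not anything in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOT_BYTES 128u	/* hypothetical slot size */

/* Hypothetical per-task state; in_xol_slot models the new TIF flag. */
struct task_model {
	bool in_xol_slot;
	uintptr_t ip;
	uintptr_t slot;	/* base of the slot holding the copied insns */
};

/* Probe hit: set the flag and start executing at the slot base. */
static void model_arm(struct task_model *t, uintptr_t slot)
{
	t->in_xol_slot = true;
	t->slot = slot;
	t->ip = slot;
}

/* Context switch hook: the task is about to run using @new_slot. */
static void model_switch_to(struct task_model *t, uintptr_t new_slot)
{
	uintptr_t off;

	if (!t->in_xol_slot)
		return;

	/* Unsigned wrap-around makes ip < slot look like a huge offset. */
	off = t->ip - t->slot;
	if (off >= SLOT_BYTES) {
		/* IP already left the slot (the jmp ran): clear the flag. */
		t->in_xol_slot = false;
		return;
	}

	/* Still inside the slot: move it, keeping the relative offset. */
	t->slot = new_slot;
	t->ip = new_slot + off;
}
```

So the flag stays set across any number of switches while the IP is
inside the slot, and the first switch that sees the IP elsewhere clears
it without touching the IP.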

