Re: [PATCH] uprobes: Use synchronize_rcu() not synchronize_sched()

From: Oleg Nesterov
Date: Fri Aug 10 2018 - 09:36:14 EST


On 08/10, Steven Rostedt wrote:
>
> On Fri, 10 Aug 2018 13:35:49 +0200
> Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>
> > On 08/09, Steven Rostedt wrote:
> > >
> > > --- a/kernel/trace/trace_uprobe.c
> > > +++ b/kernel/trace/trace_uprobe.c
> > > @@ -952,7 +952,7 @@ probe_event_disable(struct trace_uprobe *tu, struct trace_event_file *file)
> > >
> > > list_del_rcu(&link->list);
> > > /* synchronize with u{,ret}probe_trace_func */
> > > - synchronize_sched();
> > > + synchronize_rcu();
> >
> > Can't we change uprobe_trace_func() and uretprobe_trace_func() to use
> > rcu_read_lock_sched() instead? It is more cheap.
>
> Is it? rcu_read_lock_sched() is a preempt_disable(),

which is just raw_cpu_inc()

> where
> rcu_read_lock() may just be a task counter increment.

and __rcu_read_unlock() is heavier.

OK, I agree, this doesn't really matter.

> > Hmm. probe_event_enable() does list_del + kfree on failure, this doesn't
> > look right... Not only because kfree() can race with list_for_each_entry_rcu(),
> > we should not put the 1st link on list until uprobe_buffer_enable().
> >
> > Does the patch below make sense or I am confused?
>
> I guess the question is, if it isn't enabled, are there any users or
> even past users still running.

Note that uprobe_register() is not "atomic".

To simplify, suppose we have 2 tasks T1 and T2 running the probed binary.
So we are going to do install_breakpoint(T1->mm) + install_breakpoint(T2->mm).
If the 2nd install_breakpoint() fails for any reason, _register() will do
remove_breakpoint(T1->mm) and return the error.

However, T1 can hit this bp right after install_breakpoint(T1->mm), so it
can call uprobe_trace_func() before list_del(&link->list).
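
To make the window concrete, here is a toy userspace model of that sequence. It is only a sketch and not kernel code: every toy_* name is hypothetical, install_fails stands in for "the 2nd install_breakpoint() fails for any reason", and the runner argument is an assumed stand-in for T1 getting to run between the two installs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, none exist in the kernel. */
struct toy_mm {
	bool bp_installed;	/* breakpoint currently present in this mm */
	bool install_fails;	/* force the install step to fail for this mm */
};

struct toy_probe {
	bool link_on_list;	/* models the link being visible to the handler */
	int handler_calls;	/* how many times the handler saw the link */
};

/* Models a task hitting the breakpoint and the handler walking the list. */
static void toy_task_hits_bp(struct toy_probe *p, const struct toy_mm *mm)
{
	if (mm->bp_installed && p->link_on_list)
		p->handler_calls++;
}

/*
 * Models a non-atomic register: install per mm, undo everything on failure.
 * If runner is non-NULL, its task "runs" after each successful install.
 */
static int toy_register(struct toy_probe *p, struct toy_mm *mms, size_t n,
			const struct toy_mm *runner)
{
	for (size_t i = 0; i < n; i++) {
		if (mms[i].install_fails) {
			/* roll back the installs that already succeeded */
			while (i-- > 0)
				mms[i].bp_installed = false;
			return -1;
		}
		mms[i].bp_installed = true;
		/* the window: a task can hit the breakpoint right here */
		if (runner)
			toy_task_hits_bp(p, runner);
	}
	return 0;
}

/* Models the enable path: publish the link, register, unlink on failure. */
static int toy_enable(struct toy_probe *p, struct toy_mm *mms, size_t n,
		      const struct toy_mm *runner)
{
	int ret;

	p->link_on_list = true;
	ret = toy_register(p, mms, n, runner);
	if (ret)
		p->link_on_list = false;	/* the failure path removes the link */
	return ret;
}
```

With two mms where only the second has install_fails set and the first passed as runner, toy_enable() returns an error, yet handler_calls ends up at 1: the handler already saw the link before the failure path removed it.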

OK, even if I am right, this is mostly theoretical.

Oleg.