Re: Q: perf_event && event->owner

From: Oleg Nesterov
Date: Wed Nov 10 2010 - 10:50:59 EST


On 11/10, Peter Zijlstra wrote:
>
> On Tue, 2010-11-09 at 19:57 +0100, Oleg Nesterov wrote:
> > Either sys_perf_open() should do get_task_struct() like we currently
> > do, or perf_event_exit_task() should clear event->owner and then
> > perf_release() should do something like
> >
> > 	rcu_read_lock();
> > 	owner = event->owner;
> > 	if (owner)
> > 		get_task_struct(owner);
> > 	rcu_read_unlock();
> >
> > 	if (owner) {
> > 		mutex_lock(&event->owner->perf_event_mutex);
> > 		list_del_init(&event->owner_entry);
> > 		mutex_unlock(&event->owner->perf_event_mutex);
> > 		put_task_struct(owner);
> > 	}
> >
> > Probably this can be simplified...
>
> I think that's still racy, suppose we do:
>
> void perf_event_exit_task(struct task_struct *child)
> {
> 	struct perf_event *event, *tmp;
> 	int ctxn;
>
> 	mutex_lock(&child->perf_event_mutex);
> 	list_for_each_entry_safe(event, tmp, &child->perf_event_list,
> 				 owner_entry) {
> 		event->owner = NULL;
> 		list_del_init(&event->owner_entry);
> 	}
> 	mutex_unlock(&child->perf_event_mutex);
>
> 	for_each_task_context_nr(ctxn)
> 		perf_event_exit_task_context(child, ctxn);
> }
>
>
> and the close() races with an exit, then couldn't we observe
> event->owner after the last put_task_struct()?

I don't think so. Note that we do not simply free task_struct from the rcu
callback. Instead, delayed_put_task_struct() drops what may be the last
reference.
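
A minimal userspace sketch of that point, not kernel code: task_model and
the *_task_model() helpers are hypothetical stand-ins for task_struct,
get_task_struct(), put_task_struct() and delayed_put_task_struct(). It only
models the reference counting; it assumes the reader took its reference
inside the rcu read-side section, before the callback could run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for task_struct: only the refcount matters here. */
struct task_model {
	atomic_int usage;
	bool freed;
};

static void get_task_model(struct task_model *t)
{
	atomic_fetch_add(&t->usage, 1);
}

/* "Frees" (here: marks as freed) only when the last reference is dropped. */
static void put_task_model(struct task_model *t)
{
	if (atomic_fetch_sub(&t->usage, 1) == 1)
		t->freed = true;
}

/* Models the rcu callback: it drops one reference, it does not free directly. */
static void delayed_put_task_model(struct task_model *t)
{
	put_task_model(t);
}

/*
 * Returns true if a reference taken on the release side keeps the owner
 * alive across the callback, and the owner is freed only by the final put.
 */
bool pinned_owner_survives_callback(void)
{
	struct task_model owner = { .usage = 1, .freed = false };
	bool alive_after_callback;

	get_task_model(&owner);			/* release side pins the owner */
	delayed_put_task_model(&owner);		/* callback drops the task's own ref */
	alive_after_callback = !owner.freed;

	put_task_model(&owner);			/* release side is done with it */
	return alive_after_callback && owner.freed;
}
```

Because the callback only does a put, whoever holds the extra reference
decides when the structure really goes away.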

But the code is racy, yes. The owner != NULL case is fine. But
perf_release() can see event->owner == NULL before list_del_init() has
completed. perf_event_exit_task() needs wmb() in between, I think.
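
A rough userspace sketch of the ordering this asks for, under stated
assumptions: event_model, list_node and the helper names are hypothetical,
and C11 release/acquire stands in for the wmb() and a matching read-side
barrier. It assumes the exit side unlinks the entry first and clears the
owner second, so that a reader which sees owner == NULL may also rely on
the entry being unlinked. The check at the end is single-threaded, so it
shows the intended invariant and does not reproduce the race.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly-linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
	n->next = n->prev = n;
}

static void list_add(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

static bool list_empty(const struct list_node *n)
{
	return n->next == n;
}

/* Hypothetical stand-in for perf_event: owner pointer plus list linkage. */
struct event_model {
	_Atomic(void *) owner;
	struct list_node owner_entry;
};

/* Exit side: unlink first, then publish owner == NULL with release ordering. */
static void exit_side_detach(struct event_model *ev)
{
	list_del_init(&ev->owner_entry);
	atomic_store_explicit(&ev->owner, NULL, memory_order_release);
}

/*
 * Release side: returns true only if the owner is already gone and the
 * entry is observed unlinked. A non-NULL owner would instead be pinned
 * and its mutex taken, which this sketch does not model.
 */
static bool release_side_sees_detached(struct event_model *ev)
{
	void *owner = atomic_load_explicit(&ev->owner, memory_order_acquire);

	if (owner)
		return false;
	return list_empty(&ev->owner_entry);
}

bool detach_then_release_is_consistent(void)
{
	struct list_node head;
	struct event_model ev;
	int task;	/* dummy object standing in for the owner task */

	list_init(&head);
	list_init(&ev.owner_entry);
	atomic_init(&ev.owner, (void *)&task);
	list_add(&ev.owner_entry, &head);

	if (release_side_sees_detached(&ev))
		return false;	/* still owned, must not look detached */

	exit_side_detach(&ev);
	return release_side_sees_detached(&ev) && list_empty(&head);
}
```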

Oleg.
