Re: PATCH? trace_remove_event_call() should fail if call is active

From: Oleg Nesterov
Date: Wed Jul 03 2013 - 15:22:42 EST


On 07/03, Steven Rostedt wrote:
>
> On Wed, 2013-07-03 at 19:54 +0200, Oleg Nesterov wrote:
> > On 07/03, Oleg Nesterov wrote:
> > >
> > > IOW. So far _I think_ we just need the additional changes in
> > > trace_remove_event_call() if it succeeds (with the patch I sent)
> > > to prevent the races like above, but I didn't even try to think
> > > about this problem.
> >
> > And I guess greatly underestimated the problem(s). When I look at
> > this code now, it seems that, say, event_enable_write() will use
> > the already freed ftrace_event_file in this case.
> >
> > Still I think this is another (although closely related) problem.
>
> Correct, and I think if we fix that problem, it will encapsulate fixing
> the kprobe race too.

I do not think so, but I can be easily wrong. Again, we shouldn't
destroy the event if there is a perf_event attached to this tp_event.
And we can't (afaics!) rely on TRACE_REG_UNREGISTER from event_remove()
paths, FTRACE_EVENT_FL_SOFT_MODE can nack it.

So I still think that we also need something like the patch I sent.
But please forget about this for the moment.

Can't we do something like below? Just in case, of course this change
is incomplete, just to explain what I mean... And of course I have no
idea if the change in debugfs is safe, I never looked into fs/debugfs
before. But, perhaps, somehow we can clear i_private under event_mutex
and kernel/trace can use file_inode() instead of filp->private_data ?
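
To make the idea above concrete, here is a small user-space model of the
pattern. It is only a sketch: every toy_* name is hypothetical, a pthread
mutex stands in for event_mutex, and it assumes the inode itself stays
alive for as long as the file is open. The removal path clears i_private
under the mutex, and the write path re-reads i_private under the same
mutex instead of trusting the pointer cached at open time.

```c
/* User-space toy model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

struct toy_event_file {
	int enabled;
};

struct toy_inode {
	void *i_private;	/* published pointer, cleared on removal */
};

struct toy_file {
	struct toy_inode *inode;
	void *private_data;	/* snapshot taken at open time; may go stale */
};

/* Stands in for event_mutex. */
static pthread_mutex_t event_mutex = PTHREAD_MUTEX_INITIALIZER;

static void toy_open(struct toy_file *filp, struct toy_inode *inode)
{
	filp->inode = inode;
	filp->private_data = inode->i_private;
}

/* Removal path: free the object and unpublish it under the mutex. */
static void toy_remove_event(struct toy_inode *inode)
{
	pthread_mutex_lock(&event_mutex);
	free(inode->i_private);
	inode->i_private = NULL;
	pthread_mutex_unlock(&event_mutex);
}

/* Write path: look the object up through the inode under the same mutex. */
static int toy_enable_write(struct toy_file *filp, int val)
{
	struct toy_event_file *file;
	int ret = -EINVAL;

	if (val != 0 && val != 1)
		return -EINVAL;

	pthread_mutex_lock(&event_mutex);
	file = filp->inode->i_private;	/* never filp->private_data */
	if (file) {
		file->enabled = val;
		ret = 0;
	}
	pthread_mutex_unlock(&event_mutex);

	return ret;
}
```

In this model a write that comes after toy_remove_event() sees NULL and
returns -EINVAL; the stale filp->private_data is never dereferenced.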

Oleg.


diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index 4888cb3..c23d41e 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -475,6 +475,7 @@ static int __debugfs_remove(struct dentry *dentry, struct dentry *parent)
 				kfree(dentry->d_inode->i_private);
 				/* fall through */
 			default:
+				dentry->d_inode->i_private = NULL;
 				simple_unlink(parent->d_inode, dentry);
 				break;
 			}
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 27963e2..bdfd161 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -643,13 +643,10 @@ static ssize_t
 event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
 		   loff_t *ppos)
 {
-	struct ftrace_event_file *file = filp->private_data;
+	struct ftrace_event_file *file;
 	unsigned long val;
 	int ret;
 
-	if (!file)
-		return -EINVAL;
-
 	ret = kstrtoul_from_user(ubuf, cnt, 10, &val);
 	if (ret)
 		return ret;
@@ -661,8 +658,11 @@ event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt,
 	switch (val) {
 	case 0:
 	case 1:
+		ret = -EINVAL;
 		mutex_lock(&event_mutex);
-		ret = ftrace_event_enable_disable(file, val);
+		file = file_inode(filp)->i_private;
+		if (file)
+			ret = ftrace_event_enable_disable(file, val);
 		mutex_unlock(&event_mutex);
 		break;

