Re: [PATCH v2 0/2] signalfd/epoll fixes

From: Oleg Nesterov
Date: Sat Feb 25 2012 - 11:16:16 EST


On 02/24, Linus Torvalds wrote:
>
> On Fri, Feb 24, 2012 at 12:23 PM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > I'm *really* conflicted, because I have this really strong feeling
> > that it's just papering over a symptom, and we damn well shouldn't be
> > doing that.

Heh. This is even documented in the changelog.

> I really think that what we really should do is allow
> > "poll()" to have a "poll_remove" callback (so each "add_poll_wait()"
> > will have a callback when it gets remove).

Yes. I also thought about this. In fact I think this probably makes
sense even ignoring epoll problems. Although I was thinking about
file_operations::poll_v2(file, poll_table, poll_table_entry, mode)
which could could replace the old ->poll() eventually. But perhaps
the explicit poll_rm() is better.

However, speaking about epoll/sigfd, imho this hides the problem too.

> +static void signalfd_poll_rm(struct file *file, wait_queue_head_t *p)
> +{
> + struct sighand_struct *sighand;
> +
> + sighand = container_of(p, struct sighand_struct, signalfd_wqh);
> + __cleanup_sighand(sighand);
> +}
> +
> static unsigned int signalfd_poll(struct file *file, poll_table *wait)
> {
> struct signalfd_ctx *ctx = file->private_data;
> unsigned int events = 0;
>
> - poll_wait(file, &current->sighand->signalfd_wqh, wait);
> + if (poll_wait(file, &current->sighand->signalfd_wqh, wait))
> + atomic_inc(&current->sighand->count);

Yes, this fixes use-after-free. But suppose the the application does:

sigfd = signalfd(...);
epoll_ctl(epollfd, EPOLL_CTL_ADD, sigfd);
execve();

de_thread() unshares ->sighand because of atomic_inc() above and
epollfd no longer works after exec, and the application can't know
this. (yes, currently we have the same problem with CLONE_SIGHAND'ed
apps, but still).

We can turn ->signalfd_wqh into the pointer to the refcountable
memory, or add another counter, but this is not nice.

And in any case. If the task exits, to me it looks simply pointless
to keep its ->sighand until __fput(). This only makes poll_rm() or
whatever safe, it can't report a signal. OK, OK, I am not arguing,
POLLFREE is equally ugly or even worse.

I _think_ that the right fix should move wait_queue_head_t from
sighand_struct to signalfd_ctx (file->private_data) somehow...

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/