Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes

From: Linus Torvalds
Date: Fri May 03 2024 - 17:34:13 EST


On Fri, 3 May 2024 at 14:24, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> Can we get to ep_item_poll(epi, ...) after eventpoll_release_file()
> got past __ep_remove()? Because if we can, we have a worse problem -
> epi freed under us.

Look at the hack in __ep_remove(): if it is concurrent with
eventpoll_release_file(), it will hit this code

spin_lock(&file->f_lock);
if (epi->dying && !force) {
spin_unlock(&file->f_lock);
return false;
}

and not free the epi.

But as far as I can tell, almost nothing else cares about the f_lock
and dying logic.

And in fact, I don't think doing

spin_lock(&file->f_lock);

is even valid in the places that look up file through "epi->ffd.file",
because the lock itself is inside the thing that you can't trust until
you've taken the lock...

So I agree with Kees about the use of "atomic_dec_not_zero()" kind of
logic - but it also needs to be in an RCU-readlocked region, I think.

I wish epoll() just took the damn file ref itself. But since it relies
on the file refcount to release the data structure, that obviously
can't work.

Linus