Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

From: Jens Axboe
Date: Fri May 03 2024 - 15:35:24 EST


On 5/3/24 1:22 PM, Kees Cook wrote:
> On Fri, May 03, 2024 at 12:49:11PM -0600, Jens Axboe wrote:
>> On 5/3/24 12:26 PM, Kees Cook wrote:
>>> Thanks for doing this analysis! I suspect at least a start of a fix
>>> would be this:
>>>
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index 8fe5aa67b167..15e8f74ee0f2 100644
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -267,9 +267,8 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>>>
>>> if (events & EPOLLOUT) {
>>> /* Paired with fput in dma_buf_poll_cb */
>>> - get_file(dmabuf->file);
>>> -
>>> - if (!dma_buf_poll_add_cb(resv, true, dcb))
>>> + if (!atomic_long_inc_not_zero(&dmabuf->file) &&
>>> + !dma_buf_poll_add_cb(resv, true, dcb))
>>> /* No callback queued, wake up any other waiters */
>>
>> Don't think this is sane at all. I'm assuming you meant:
>>
>> atomic_long_inc_not_zero(&dmabuf->file->f_count);
>
> Oops, yes, sorry. I was typed from memory instead of copy/paste.

Figured :-)

>> but won't fly as you're not under RCU in the first place. And what
>> protects it from being long gone before you attempt this anyway? This is
>> sane way to attempt to fix it, it's completely opposite of what sane ref
>> handling should look like.
>>
>> Not sure what the best fix is here, seems like dma-buf should hold an
>> actual reference to the file upfront rather than just stash a pointer
>> and then later _hope_ that it can just grab a reference. That seems
>> pretty horrible, and the real source of the issue.
>
> AFAICT, epoll just doesn't hold any references at all. It depends,
> I think, on eventpoll_release() (really eventpoll_release_file())
> synchronizing with epoll_wait() (but I don't see how this happens, and
> the race seems to be against ep_item_poll() ...?)
>
> I'm really confused about how eventpoll manages the lifetime of polled
> fds.

epoll doesn't hold any references, and it's got some ugly callback to
deal with that. It's not ideal, nor pretty, but that's how it currently
works. See eventpoll_release() and how it's called. This means that
epoll itself is supposedly safe from the file going away, even though it
doesn't hold a reference to it.

Except that in this case, the file is already gone by the time
eventpoll_release() is called. Which presumably is some interaction with
the somewhat suspicious file reference management that dma-buf is doing.
But I didn't look into that much, outside of noting it looks a bit
suspect.

>>> Due to this issue I've proposed fixing get_file() to detect pathological states:
>>> https://lore.kernel.org/lkml/20240502222252.work.690-kees@xxxxxxxxxx/
>>
>> I don't think this would catch this case, as the memory could just be
>> garbage at this point.
>
> It catches it just fine! :) I tested it against the published PoC.

Sure it _may_ catch the issue, but the memory may also just be garbage
at that point. Not saying it's a useless addition, outside of the usual
gripes of making the hot path slower, just that it won't catch all
cases. Which I guess is fine and expected.

> And for cases where further allocations have progressed far enough to
> corrupt the freed struct file and render the check pointless, nothing
> different has happened than what happens today. At least now we have a
> window to catch the situation across the time frame before it is both
> reallocated _and_ the contents at the f_count offset gets changed to
> non-zero.

Right.

--
Jens Axboe