Re: [PATCH RFC] io_uring: make signalfd work with io_uring (and aio) POLL

From: Jens Axboe
Date: Thu Nov 14 2019 - 10:09:09 EST


On 11/14/19 7:12 AM, Rasmus Villemoes wrote:
> On 14/11/2019 14.46, Jann Horn wrote:
>> On Thu, Nov 14, 2019 at 10:20 AM Rasmus Villemoes
>> <linux@xxxxxxxxxxxxxxxxxx> wrote:
>>> On 14/11/2019 05.49, Jens Axboe wrote:
>>>> On 11/13/19 9:31 PM, Jens Axboe wrote:
>>>>> This is a case of "I don't really know what I'm doing, but this works
>>>>> for me". Caveat emptor, but I'd love some input on this.
>>>>>
>>>>> I got a bug report that using the poll command with signalfd doesn't
>>>>> work for io_uring. The reporter also noted that it doesn't work with the
>>>>> aio poll implementation either. So I took a look at it.
>>>>>
>>>>> What happens is that the original task issues the poll request, we call
>>>>> ->poll() (which ends up with signalfd for this fd), and find that
>>>>> nothing is pending. Then we wait, and the poll is passed to async
>>>>> context. When the requested signal comes in, that worker is woken up,
>>>>> and proceeds to call ->poll() again, and signalfd unsurprisingly finds
>>>>> no signals pending, since it's the async worker calling it.
>>>>>
>>>>> That's obviously no good. The below allows you to pass in the task in
>>>>> the poll_table, and it does the right thing for me, signal is delivered
>>>>> and the correct mask is checked in signalfd_poll().
>>>>>
>>>>> Similar patch for aio would be trivial, of course.
>>>>
>>>> From the probably-less-nasty category, Jann Horn helpfully pointed out
>>>> that it'd be easier if signalfd just looked at the task that originally
>>>> created the fd instead. That looks like the below, and works equally
>>>> well for the test case at hand.
>>>
>>> Eh, how should that work? If I create a signalfd() and fork(), the
>>> child's signalfd should only be concerned with signals sent to the
>>> child. Not to mention what happens after the parent dies and the child
>>> polls its fd.
>>>
>>> Or am I completely confused?
>>
>> I think the child should not be getting signals for the child when
>> it's reading from the parent's signalfd. read() and write() aren't
>> supposed to look at properties of `current`.
>
> That may be, but this has always been the semantics of signalfd(), quite
> clearly documented in 'man signalfd'.
>
>> Of course, if someone does rely on the current (silly) semantics, this
>> might break stuff.
>
> That, and Jens' patch only seemed to change the poll callback, so the
> child (or whoever else got a hand on that signalfd) would wait for the
> parent to get a signal, but then a subsequent read would attempt to
> dequeue from the child itself.
>
> So, I can't really think of anybody that might be relying on inheriting
> a signalfd instead of just setting it up in the child, but changing the
> semantics of it now seems rather dangerous. Also, I _can_ imagine
> threads in a process sharing a signalfd (initial thread sets it up and
> blocks the signals, all threads subsequently use that same fd), and for
> that case it would be wrong for one thread to dequeue signals directed
> at the initial thread. Plus the lifetime problems.

What if we just made it specific to SFD_CLOEXEC? I don't want to break
existing applications, even if the use case is nonsensical, but it is
important to allow signalfd to be properly used with use cases that are
already in the kernel (aio with IOCB_CMD_POLL, io_uring with
IORING_OP_POLL_ADD). Alternatively, if need be, we could add a specific
SFD_ flag for this. That might also help applications know whether this
will work with io_uring/aio at all.
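
For reference, here is a minimal userspace sketch of the existing
semantics discussed above, assuming a single-threaded process. The
parent blocks SIGUSR1, creates a signalfd and sends itself the signal,
then a forked child polls the inherited fd. signalfd_fork_demo() and
sfd_readable() are names made up for this illustration, they are not
part of the patch. Since signalfd looks at the signals pending for
whoever calls poll(), the parent should see the fd as readable while
the child should not.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Non-blocking poll: 1 if fd is readable, 0 if not, -1 on error. */
static int sfd_readable(int fd)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret = poll(&pfd, 1, 0);

	if (ret < 0)
		return -1;
	return ret > 0 && (pfd.revents & POLLIN);
}

/*
 * Hypothetical demo helper. Returns a bitmask, or -1 on error:
 *   bit 0: parent sees SIGUSR1 pending through the signalfd
 *   bit 1: forked child sees it through the inherited fd
 */
int signalfd_fork_demo(void)
{
	struct signalfd_siginfo si;
	sigset_t mask;
	int sfd, status, parent, child;
	pid_t pid;

	/* Block SIGUSR1 so it stays pending instead of being delivered. */
	sigemptyset(&mask);
	sigaddset(&mask, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)
		return -1;

	/* SFD_CLOEXEC only matters for exec; fork() still inherits the fd. */
	sfd = signalfd(-1, &mask, SFD_CLOEXEC);
	if (sfd < 0)
		return -1;

	/* Queue a signal for the parent before forking. */
	if (kill(getpid(), SIGUSR1) < 0) {
		close(sfd);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(sfd);
		return -1;
	}
	if (pid == 0) {
		/* Child: its pending set starts out empty after fork(). */
		int r = sfd_readable(sfd);

		_exit(r < 0 ? 2 : r);
	}

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
	    WEXITSTATUS(status) > 1) {
		close(sfd);
		return -1;
	}
	child = WEXITSTATUS(status);
	parent = sfd_readable(sfd);

	/* Dequeue the signal so nothing is left pending for the caller. */
	if (parent == 1 && read(sfd, &si, sizeof(si)) != sizeof(si))
		parent = -1;
	close(sfd);

	if (parent < 0)
		return -1;
	return parent | (child << 1);
}
```

With the current behavior I'd expect signalfd_fork_demo() to return 1:
readable in the parent, not in the child. That child-side result is
what would change if poll looked at the task that created the fd.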

--
Jens Axboe