Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends

From: Sebastian Andrzej Siewior
Date: Tue May 14 2019 - 05:14:05 EST


On 2019-05-14 10:43:56 [+0200], Peter Zijlstra wrote:
> Now.. that will fix it, but I think it is also wrong.
>
> The problem being that it violates FIFO, something that might be more
> important on -RT than elsewhere.

Wouldn't -RT be more about waking the task with the highest priority
instead of the one that waited the longest?

> The regular wait API seems confused/inconsistent when it uses
> autoremove_wake_function and default_wake_function, which doesn't help,
> but we can easily support this with swait -- the problematic thing is
> the custom wake functions, we mustn't do that.
>
> (also, mingo went and renamed a whole bunch of wait_* crap and didn't do
> the same to swait_ so now its named all different :/)
>
> Something like the below perhaps.

This still violates FIFO because a task can call wait_for_completion(),
notice a pending wake, and leave without ever enqueueing itself on the
list. The list order itself is preserved, we have that.
But this is a completion list. We probably have multiple workers waiting
for something to do, so all of those should be of equal priority, maybe
one for each core or so. So it shouldn't matter which one we wake up.
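
To make the bypass concrete, here is a minimal userspace sketch. It is a
toy model and not the kernel code: struct toy_completion and all toy_*()
helpers are made-up names, there is no locking or real sleeping, and the
queued waiter is assumed to stay on the list until it runs again.

```c
#include <stddef.h>

#define TOY_MAX_WAITERS 8

/* Toy stand-in for a completion: a counter plus a FIFO of waiter names. */
struct toy_completion {
	unsigned int done;
	const char *queue[TOY_MAX_WAITERS];
	size_t nr_queued;
};

/* Signal one completion; a real implementation would also wake the head. */
static void toy_complete(struct toy_completion *c)
{
	c->done++;
}

/*
 * Start waiting. Returns 1 if a pending completion was consumed right
 * away (the task never touches the queue), 0 if the task was queued.
 */
static int toy_wait_begin(struct toy_completion *c, const char *task)
{
	if (c->done) {
		c->done--;
		return 1;
	}
	if (c->nr_queued < TOY_MAX_WAITERS)
		c->queue[c->nr_queued++] = task;
	return 0;
}

/*
 * The woken head waiter runs again. Returns its name if it consumed a
 * completion, or NULL if nothing is left and it has to keep sleeping.
 */
static const char *toy_wait_resume(struct toy_completion *c)
{
	const char *task;
	size_t i;

	if (!c->nr_queued || !c->done)
		return NULL;
	task = c->queue[0];
	for (i = 1; i < c->nr_queued; i++)
		c->queue[i - 1] = c->queue[i];
	c->nr_queued--;
	c->done--;
	return task;
}

/* Returns 1 if late-comer B overtook queued waiter A, 0 otherwise. */
static int toy_late_waiter_overtakes(void)
{
	struct toy_completion c = { 0 };
	int b_fast;

	toy_wait_begin(&c, "A");          /* A queues, nothing pending yet */
	toy_complete(&c);                 /* wake pending, A has not run   */
	b_fast = toy_wait_begin(&c, "B"); /* B sees done > 0 and leaves    */
	return b_fast && toy_wait_resume(&c) == NULL && c.nr_queued == 1;
}
```

In this model B never appears on the list, so the list order stays
intact while A, which waited longer, still loses the completion to B.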

Corey, would it make any difference which waiter is woken up?

Sebastian