Re: net/sched: latent livelock in dev_deactivate_many() due to yield() usage

From: Peter Zijlstra
Date: Thu Apr 06 2017 - 06:27:18 EST


On Tue, Apr 04, 2017 at 10:25:19PM -0700, Cong Wang wrote:
> On Tue, Apr 4, 2017 at 8:20 PM, Mike Galbraith <efault@xxxxxx> wrote:
> > -	while (some_qdisc_is_busy(dev))
> > -		yield();
> > +	swait_event_timeout(swait, !some_qdisc_is_busy(dev), 1);
> >  }
>
> I don't see why this is an improvement even if I don't care about the
> hardcoded timeout for now... Why the scheduler can make a better
> decision with swait_event_timeout() than with cond_resched()?

cond_resched() might be a no-op, e.g. on a CONFIG_PREEMPT kernel.

And doing yield() will result in a priority inversion deadlock. Imagine
the task doing yield() being the top priority (fifo99) task in the
system. yield() only lets other runnable tasks of the same priority go
first, so it will simply spin forever, never giving whatever task is
required to make your condition true any time to run.
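
To make that concrete, here is a toy userspace model of the problem. It is
only a sketch, not kernel code: struct task, pick_next() and the two
worker_runs_*() helpers are names made up for the illustration, and the
"scheduler" is reduced to "always run the highest-priority runnable task",
which is the part of SCHED_FIFO that matters here.

```c
/* Toy model only; none of these names are kernel APIs. */
struct task {
	int prio;	/* higher value = higher priority */
	int runnable;	/* 1 if the task is on the runqueue */
};

enum { WAITER = 0, WORKER = 1, NR_TASKS = 2 };

/* SCHED_FIFO-like pick: highest-priority runnable task wins. */
static int pick_next(const struct task *t, int n)
{
	int best = -1;

	for (int i = 0; i < n; i++)
		if (t[i].runnable && (best < 0 || t[i].prio > t[best].prio))
			best = i;
	return best;
}

/*
 * The waiter polls with yield(): it stays runnable, so the pick
 * returns the waiter again every time. Returns how often the
 * low-priority worker got the CPU.
 */
static int worker_runs_with_yield(int spins)
{
	struct task tasks[NR_TASKS] = {
		[WAITER] = { .prio = 99, .runnable = 1 },
		[WORKER] = { .prio = 1,  .runnable = 1 },
	};
	int worker_runs = 0;

	for (int i = 0; i < spins; i++)
		if (pick_next(tasks, NR_TASKS) == WORKER)
			worker_runs++;
	return worker_runs;
}

/*
 * The waiter sleeps instead (as a wait-with-timeout would make it do):
 * it leaves the runqueue, the worker is picked, then the timeout
 * makes the waiter runnable again.
 */
static int worker_runs_with_sleep(int spins)
{
	struct task tasks[NR_TASKS] = {
		[WAITER] = { .prio = 99, .runnable = 1 },
		[WORKER] = { .prio = 1,  .runnable = 1 },
	};
	int worker_runs = 0;

	for (int i = 0; i < spins; i++) {
		tasks[WAITER].runnable = 0;	/* block */
		if (pick_next(tasks, NR_TASKS) == WORKER)
			worker_runs++;
		tasks[WAITER].runnable = 1;	/* timeout wakeup */
	}
	return worker_runs;
}
```

In this model the worker never runs in the yield() variant no matter how
many iterations you give it, while in the sleeping variant it runs once per
iteration. The real scheduler is of course far more involved; the point is
just that a runnable fifo99 task is always the one that gets picked.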