Re: [RFC v7] wait: prevent exclusive waiter starvation

From: Chris Mason
Date: Thu Jan 29 2009 - 09:38:28 EST


On Thu, 2009-01-29 at 01:11 -0800, Andrew Morton wrote:
> On Thu, 29 Jan 2009 09:31:08 +0100 Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>
> > On 01/28, Andrew Morton wrote:
> > >
> > > On Thu, 29 Jan 2009 05:42:27 +0100 Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> > >
> > > > On 01/28, Johannes Weiner wrote:
> > > > >
> > > > > Add abort_exclusive_wait() which removes the process' wait descriptor
> > > > > from the waitqueue, iff still queued, or wakes up the next waiter
> > > > > otherwise. It does so under the waitqueue lock. Racing with a wake
> > > > > up means the aborting process is either already woken (removed from
> > > > > the queue) and will wake up the next waiter, or it will remove itself
> > > > > from the queue and the concurrent wake up will apply to the next
> > > > > waiter after it.
> > > > >
> > > > > Use abort_exclusive_wait() in __wait_event_interruptible_exclusive()
> > > > > and __wait_on_bit_lock() when they were interrupted by other means
> > > > > than a wake up through the queue.
> > > >
> > > > Imho, this all is right, and this patch should replace
> > > > lock_page_killable-avoid-lost-wakeups.patch (except for stable tree).
> > >
> > > I dropped lock_page_killable-avoid-lost-wakeups.patch a while ago.
> > >
> > > So I think we're saying that
> > > lock_page_killable-avoid-lost-wakeups.patch actually did fix the bug?
> >
> > I think yes,
> >

Our test case that was able to reliably trigger the bug was fixed by
lock_page_killable-avoid-lost-wakeups.patch.

I'll ask them to test v7 as well. The run takes about a day, so
confirmation will take a bit.

-chris


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/