Re: [PATCH 0/4] locks: avoid thundering-herd wake-ups

From: Jeff Layton
Date: Thu Aug 09 2018 - 10:49:20 EST


On Thu, 2018-08-09 at 09:00 -0400, J. Bruce Fields wrote:
> On Wed, Aug 08, 2018 at 06:50:06PM -0400, Jeff Layton wrote:
> > That seems like a legit problem.
> >
> > One possible fix might be to have the waiter on (1,2) walk down the
> > entire subtree and wake up any waiter that is waiting on a lock that
> > doesn't conflict with the lock on which it's waiting.
> >
> > So, before the task waiting on 1,2 goes back to sleep to wait on 2,2, it
> > could walk down its entire fl_blocked subtree and wake up anything
> > waiting on a lock that doesn't conflict with (2,2).
> >
> > That's potentially an expensive operation, but:
> >
> > a) the task is going back to sleep anyway, so letting it do a little
> > extra work before that should be no big deal
>
> I don't understand why cpu used by a process going to sleep is cheaper
> than cpu used in any other situation.
>

It's not any cheaper in that sense. It's just that this task is about to
go to sleep and has no other useful work to do, so we aren't delaying
any "real work" by having it do this walk before returning to userland.
It's already scheduled and holds the appropriate lock.

The alternative would be to do this in the context of a different task,
but that means extra context switching and spinlocking, etc.
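
To make the walk concrete, here is a rough userspace sketch of the idea.
It is a toy model, not the fs/locks.c code: struct waiter, conflicts()
and wake_nonconflicting() are made-up names, every request is assumed to
be an exclusive lock on an inclusive byte range, and "waking" is just a
flag. A real version would also have to unlink the woken waiters from
the tree and probably avoid recursion.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy type, not the kernel's struct file_lock. Every request
 * is treated as an exclusive lock on the inclusive range [start, end]. */
struct waiter {
    int start, end;
    bool woken;
    struct waiter *child;    /* first waiter blocked behind this one */
    struct waiter *sibling;  /* next waiter blocked behind the same parent */
};

/* Two inclusive ranges conflict if they overlap (all locks exclusive here). */
static bool conflicts(const struct waiter *w, int start, int end)
{
    return w->start <= end && start <= w->end;
}

/*
 * Walk the whole subtree blocked behind 'w' and wake every waiter whose
 * request does not conflict with the lock [start, end] that 'w' is about
 * to sleep on. Returns the number of waiters woken.
 */
static int wake_nonconflicting(struct waiter *w, int start, int end)
{
    int n = 0;

    for (struct waiter *c = w->child; c; c = c->sibling) {
        if (!c->woken && !conflicts(c, start, end)) {
            c->woken = true;   /* stand-in for a real wake-up */
            n++;
        }
        /* Descend regardless: deeper waiters may be unrelated to the blocker. */
        n += wake_nonconflicting(c, start, end);
    }
    return n;
}
```

The cost is one visit per waiter in the subtree, paid by the task that
is about to block again.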

> > b) it's probably still cheaper than waking up the whole herd
>
> Yeah, I'd like to understand this.
>
> I feel like Neil's addressing two different performance costs:
>
> - the cost of waking up all the waiters
> - the cost of walking the list of waiters
>
> Are they equally important?
>
> If we only cared about the former, and only in simple cases, we could
> walk the entire list and skip waking up only the locks that conflict
> with the first one we wake. We wouldn't need the tree.
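
For comparison, the flat-list variant described above might look
something like the sketch below. Again this is only an illustration
under the same assumptions (exclusive locks on inclusive byte ranges,
waking modelled as a flag); struct req and wake_list() are hypothetical
names, not anything in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy request: exclusive lock on inclusive range [start, end]. */
struct req {
    int start, end;
    bool woken;
};

static bool overlaps(const struct req *a, const struct req *b)
{
    return a->start <= b->end && b->start <= a->end;
}

/*
 * Wake the first waiter, then walk the rest of the list and wake every
 * waiter that does not conflict with that first one. Waiters that do
 * conflict with it stay asleep. Returns the number of waiters woken.
 */
static size_t wake_list(struct req *w, size_t n)
{
    size_t woken = 0;

    if (n == 0)
        return 0;

    w[0].woken = true;
    woken++;

    for (size_t i = 1; i < n; i++) {
        if (!overlaps(&w[0], &w[i])) {
            w[i].woken = true;
            woken++;
        }
    }
    return woken;
}
```

This still walks the entire list on every wake-up, so it only addresses
the first of the two costs.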

--
Jeff Layton <jlayton@xxxxxxxxxx>