Re: [PATCH 0/5 - V2] locks: avoid thundering-herd wake-ups

From: Jeff Layton
Date: Sat Aug 11 2018 - 07:56:30 EST


On Fri, 2018-08-10 at 11:47 -0400, J. Bruce Fields wrote:
> On Fri, Aug 10, 2018 at 01:17:14PM +1000, NeilBrown wrote:
> > On Thu, Aug 09 2018, J. Bruce Fields wrote:
> >
> > > On Fri, Aug 10, 2018 at 11:50:58AM +1000, NeilBrown wrote:
> > > > You're good at this game!
> > >
> > > Everybody's got to have a hobby, mine is pathological posix locking
> > > cases....
> > >
> > > > So, because a locker with the same "owner" gets a free pass, you can
> > > > *never* say that any lock which conflicts with A also conflicts with B,
> > > > as a lock with the same owner as B will never conflict with B, even
> > > > though it conflicts with A.
> > > >
> > > > I think there is still value in having the tree, but when a waiter is
> > > > attached under a new blocker, we need to walk the whole tree beneath the
> > > > waiter and detach/wake anything that is not blocked by the new blocker.
> > >
> > > If you're walking the whole tree every time then it might as well be a
> > > flat list, I think?
> >
> > The advantage of a tree is that it keeps over-lapping locks closer
> > together.
> > For it to make a difference you would need a load where lots of threads
> > were locking several different small ranges, and other threads were
> > locking large ranges that cover all the small ranges.
>
> OK, I'm not sure I understand, but I'll give another look at the next
> version....
>
> > I doubt this is common, but it doesn't seem as strange as other things
> > I've seen in the wild.
> > The other advantage, of course, is that I've already written the code,
> > and I like it.
> >
> > Maybe I'll do a simple-list version, then a patch to convert that to the
> > clever-tree version, and we can then have something concrete to assess.
>
> That might help, thanks.
>

FWIW, I did a bit of testing with lockperf tests that I had written on
an earlier rework of this code:

https://git.samba.org/jlayton/linux.git/?p=jlayton/lockperf.git;a=summary


The posix01 and flock01 tests in there show about a 10x speedup with
this set in place.

I think something closer to Neil's design will end up being what we want
here. Consider the relatively common case where you have a whole-file
POSIX write lock held with a bunch of different waiters blocked on it
(all whole file write locks with different owners):

With Neil's patches, you will just wake up a single waiter when the
blocked lock is released, as they would all be in a long chain of
waiters.

If you keep all the locks in a single list, you'll either have to:

a) wake up all the waiters on the list when the lock comes free: no lock
is held at that point so none of them will conflict.

...or...

b) keep track of what waiters have already been awoken, and compare any
further candidate for waking against the current set of held locks and
any lock requests by waiters that you just woke.

b seems more expensive as you have to walk over a larger set of locks
on every change.
--
Jeff Layton <jlayton@xxxxxxxxxx>