Re: [PATCH 0/3] 64-bit futexes: Intro

From: Nick Piggin
Date: Fri Jun 06 2008 - 07:53:18 EST


On Thu, Jun 05, 2008 at 08:37:19PM -0700, Linus Torvalds wrote:
>
>
> On Fri, 6 Jun 2008, Nick Piggin wrote:
> >
> > What you *could* maybe do, to slightly speed up the reader fastpath, at
> > the expense of the writer fastpath, is to also have the active writer add
> > 4 to the count too, so your unlock can start with a lock xadd -4, count
> > in order to get the write-intent on the cacheline straight up.
>
> Yes, nice idea. It avoids the possible unnecessary S->M transition, but
> the downside is that it effectively slows down the write unlock by making
> it do two atomic ops even for the fastpath. So if I were to _only_ care
> about the reader path, I think it would be a great idea, but as it is, the
> current non-contended write case is actually pretty close to optimal, and
> doing the unconditional xaddl on the unlock path would slow that one down.

Yeah, it is a case of a large slowdown for write for a small speedup
for read (pity the API doesn't have explicit read and write unlocks
-- were they too lazy to type the last bit, or did they expect people
to lose track of whether they had a read or write lock? :P)

Anyway, it's obviously a tradeoff you'd just have to carefully
benchmark in real situations.
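
To make the tradeoff concrete, here is a minimal sketch of the variant being
discussed, written with C11 atomics. The bit layout (bit 0 as the writer flag,
4 per holder) and every sketch_* name are assumptions made up for
illustration; they are not taken from the patch. The shared unlock reports how
many atomic read-modify-write operations it performed, which is where the
extra cost on the write side shows up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout, NOT taken from the patch under discussion:
 * bit 0 = a writer holds the lock; every holder, reader or writer,
 * contributes HOLDER (4) to the count. */
#define WRITER 1u
#define HOLDER 4u

typedef struct { atomic_uint count; } sketch_rwlock;

/* Reader fastpath: add HOLDER as long as no writer bit is set. */
static bool sketch_read_trylock(sketch_rwlock *l)
{
    unsigned old = atomic_load(&l->count);
    while (!(old & WRITER)) {
        /* On failure 'old' is refreshed and the writer bit is rechecked. */
        if (atomic_compare_exchange_weak(&l->count, &old, old + HOLDER))
            return true;
    }
    return false;
}

/* Writer fastpath: only an idle lock can be taken. The writer sets its
 * flag AND adds HOLDER, which is the change proposed in the thread. */
static bool sketch_write_trylock(sketch_rwlock *l)
{
    unsigned expected = 0;
    return atomic_compare_exchange_strong(&l->count, &expected,
                                          WRITER + HOLDER);
}

/* Shared unlock that starts with the equivalent of "lock xadd -4".
 * Returns the number of atomic RMW operations it needed. */
static int sketch_unlock(sketch_rwlock *l)
{
    unsigned prev = atomic_fetch_sub(&l->count, HOLDER);
    assert(prev >= HOLDER);            /* caller must hold the lock */
    if (!(prev & WRITER))
        return 1;                      /* reader: one atomic op */
    atomic_fetch_and(&l->count, ~WRITER);
    return 2;                          /* writer: a second atomic op */
}
```

In this sketch a reader unlock costs one atomic op and a writer unlock costs
two, which is the slowdown on the write side mentioned above.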


> > I'd be more interested to know why this code can't be evolved into a full
> > rwlock implementation? This is a rather standard (though neat) looking rwlock
> > -- so my question is what can the patented 64-bit futex locks do that this
> > can't, or what can they do faster?
>
> Quite frankly - and this was my argument the whole time - I do not believe

> consider things like timeouts etc. Timeouts are "hard" to handle because
> they mean that you cannot use any kind of trivially incrementing "ticket
> locks" with sequence numbers (because we may have to just avoid a sequence
> if it times out), so the sequence number approach that we now use for
> kernel spinlocks was not an option. I didn't actually *write* the timeout
> versions, of course, but given the structure of the locks they really
> should be very straightforward.
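
The problem with timeouts and ticket locks can be shown with a toy ticket
lock. This is only a sketch in C11 atomics with made-up sketch_* names, not
the kernel spinlock code: tickets are handed out in order, and unlock simply
passes ownership to the next number, whether or not anybody is still waiting
on it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy ticket lock; names and layout are illustrative assumptions. */
typedef struct {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket currently allowed to hold the lock */
} sketch_ticket_lock;

/* Joining the queue is a single increment. */
static unsigned sketch_take_ticket(sketch_ticket_lock *l)
{
    return atomic_fetch_add(&l->next, 1);
}

/* A waiter holds the lock once 'owner' reaches its ticket. */
static bool sketch_is_my_turn(sketch_ticket_lock *l, unsigned ticket)
{
    return atomic_load(&l->owner) == ticket;
}

/* Unlock blindly hands the lock to the next ticket number. */
static void sketch_ticket_unlock(sketch_ticket_lock *l)
{
    atomic_fetch_add(&l->owner, 1);
}

/* Three threads are simulated sequentially: the middle one times out. */
static void sketch_abandoned_ticket_demo(void)
{
    sketch_ticket_lock l;
    atomic_init(&l.next, 0);
    atomic_init(&l.owner, 0);

    unsigned holder  = sketch_take_ticket(&l);  /* ticket 0, has the lock */
    unsigned quitter = sketch_take_ticket(&l);  /* ticket 1, will time out */
    unsigned patient = sketch_take_ticket(&l);  /* ticket 2, keeps waiting */

    assert(sketch_is_my_turn(&l, holder));

    /* quitter's timeout expires here and it simply stops waiting */
    sketch_ticket_unlock(&l);

    /* The lock now "belongs" to a ticket nobody is waiting on, so the
     * patient waiter is stuck behind it. */
    assert(atomic_load(&l.owner) == quitter);
    assert(!sketch_is_my_turn(&l, patient));
}
```

Getting out of that state means somebody has to skip the abandoned sequence
number, and that is the extra handling a plain incrementing ticket scheme
does not have.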
>
> [ Half-way subtle thing: a writer that times out needs to be very careful
> that it doesn't lose a wakeup event, but futexes actually make that part
> pretty easy - since FUTEX_WAIT returns whether you got woken up or not,
> you can just decide to wake up the next write-waiter if you cannot get
> the lock immediately and have to exit due to a timeout. ]
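
A rough sketch of that rule, assuming Linux and the raw futex(2) syscall. The
sketch_* helpers and the result enum are hypothetical names, and the lock
acquisition itself is left out; the only point is the decision a timed-out
writer has to make after FUTEX_WAIT returns.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical result type for this sketch. */
enum sketch_wait_result { SKETCH_WOKEN, SKETCH_TIMED_OUT, SKETCH_RETRY };

/* Sleep while *addr still equals 'expected', for at most 'rel'
 * (FUTEX_WAIT takes a relative timeout). */
static enum sketch_wait_result
sketch_futex_wait(uint32_t *addr, uint32_t expected,
                  const struct timespec *rel)
{
    long rc = syscall(SYS_futex, addr, FUTEX_WAIT, expected, rel, NULL, 0);
    if (rc == 0)
        return SKETCH_WOKEN;       /* a wakeup (possibly spurious) */
    if (errno == ETIMEDOUT)
        return SKETCH_TIMED_OUT;   /* nobody woke us in time */
    return SKETCH_RETRY;           /* EAGAIN (value changed) or EINTR */
}

/* Wake at most one waiter; returns how many were woken. */
static long sketch_futex_wake_one(uint32_t *addr)
{
    return syscall(SYS_futex, addr, FUTEX_WAKE, 1, NULL, NULL, 0);
}

/* The rule from the quoted aside: a writer that consumed a wakeup,
 * still could not take the lock, and is out of time must pass the
 * wakeup on so that it is not lost. */
static bool sketch_must_pass_on_wakeup(enum sketch_wait_result r,
                                       bool got_lock, bool deadline_passed)
{
    return r == SKETCH_WOKEN && !got_lock && deadline_passed;
}
```

A writer's timed slow path would call sketch_futex_wait(), retry the lock,
and call sketch_futex_wake_one() before returning a timeout error whenever
sketch_must_pass_on_wakeup() is true. Passing on a spurious wakeup only costs
an extra wake, so erring in that direction is harmless.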
>
> But I really haven't tested my rwlocks very exhaustively, and I did not
> verify that they actually scale with lots of CPU's, for example. I
> literally only have dual-core CPU's in use at home, right now, nothing
> fancier. Somebody with dual-socket quads would be a lot better off, and
> the more the merrier, of course.

Well... a single lock is only going to be so scalable. I don't see how
it could be done significantly better. Maybe a small factor of
improvement if you were to concentrate on the contended case (but you
wouldn't want to do that anyway).


