Re: [patch 00/15] Generic Mutex Subsystem

From: Ingo Molnar
Date: Mon Dec 19 2005 - 15:11:09 EST



* Benjamin LaHaise <bcrl@xxxxxxxxx> wrote:

> > [ Oh. I'm looking at the semaphore code, and I realize that we have a
> > "wake_up(&sem->wait)" in the __down() path because we had some race long
> > ago that we fixed by band-aiding over it. Which means that we wake up
> > sleepers that shouldn't be woken up. THAT may well be part of the
> > performance problem.. The semaphores are really meant to wake up just
> > one at a time, but because of that race hack they'll wake up _two_ at a
> > time - once by up(), once by down().
> >
> > That also destroys the fairness. Does anybody remember why it's that
> > way? ]
>
> History? I think that code is very close to what was done in the
> pre-SMP version of semaphores. It is certainly possible to get rid of
> the separate sleepers -- parisc seems to have such an implementation.
> It updates sem->count in the wakeup path of __down().

i think we also need to look at the larger picture. If this really is a
bug that hid for years, it shows that the semaphore code is too complex
to be properly reviewed and improved. Hence even assuming that the mutex
code does not bring direct code advantages (which i'm disputing :-), the
mutex code is far simpler and thus easier to improve. We humans have a
given number of neurons, which form a hard limit :) In fact it's the
mutex code that made it apparent that there's something wrong with
semaphores.

we saw that with the genirq code, with the spinlock code, with the
preempt code. Consolidation did not add anything drastically new, but
code consolidation _did_ make things more hackable, and improved the end
result far more than a splintered set of implementations could have.

Just look at the semaphore implementations of various architectures:
it's quite a colorful and inconsistent mix. Can you imagine adding
deadlock debugging to each of them?

Ingo