Re: [patch] speed up / fix the new generic semaphore code (fix AIM7 40% regression with 2.6.26-rc1)

From: Matthew Wilcox
Date: Thu May 08 2008 - 09:21:17 EST


On Thu, May 08, 2008 at 02:01:30PM +0200, Ingo Molnar wrote:
> Looking at the workload i found and fixed what i believe to be the real
> bug causing the AIM7 regression: it was inefficient wakeup / scheduling
> / locking behavior of the new generic semaphore code, causing suboptimal
> performance.

I did note that earlier downthread ... although to be fair, I thought of
it in terms of three tasks with the third task coming in and stealing
the second task's wakeup rather than the first task starving the second
by repeatedly locking/unlocking the semaphore.

> So if the old owner, even if just a few instructions later, does a
> down() [lock_kernel()] again, it will be blocked and will have to wait
> on the new owner to eventually be scheduled (possibly on another CPU)!
> Or if another task gets to lock_kernel() sooner than the "new
> owner" is scheduled, it will be blocked unnecessarily and for a very
> long time when there are 2000 tasks running.
>
> I.e. the implementation of the new semaphores code does wake-one and
> lock ownership in a very restrictive way - it does not allow
> opportunistic re-locking of the lock at all and keeps the scheduler from
> picking task order intelligently.

Fairness is certainly the enemy of throughput (see also dbench arguments
passim). It may be that some semaphore users really do want fairness --
it seems pretty clear that we don't want fairness for the BKL.
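
To make the trade-off concrete, here is a toy single-threaded model of
the two release policies. It is only a sketch: it is not the kernel's
semaphore code, and every name in it (toy_sem, toy_try_down, toy_up,
toy_relock_succeeds) is hypothetical. It assumes "strict handoff" means
up() passes the slot straight to a sleeper, and "opportunistic" means
up() frees the slot so that whoever runs first may take it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model for illustration only; NOT the kernel's struct semaphore. */
struct toy_sem {
    int  count;    /* free slots a running task may take */
    int  sleepers; /* tasks asleep waiting for the semaphore */
    int  granted;  /* slots handed to a sleeper that has not run yet */
    bool handoff;  /* true: strict wake-one handoff; false: opportunistic */
};

/* A running task tries to take the semaphore without sleeping. */
static bool toy_try_down(struct toy_sem *s)
{
    if (s->count > 0) {
        s->count--;
        return true;
    }
    return false; /* the caller would have to go to sleep */
}

/* Release the semaphore under the selected policy. */
static void toy_up(struct toy_sem *s)
{
    if (s->handoff && s->sleepers > 0) {
        /* Strict handoff: the slot goes directly to a sleeper, so
         * count stays at 0 until that sleeper is scheduled. */
        s->sleepers--;
        s->granted++;
    } else {
        /* Opportunistic: free the slot; a woken sleeper would have
         * to compete for it like any other task. */
        s->count++;
    }
    assert(s->count >= 0 && s->sleepers >= 0);
}

/* Task A holds the lock, task B goes to sleep on it, then A releases
 * and immediately tries to re-take it before B has been scheduled.
 * Returns whether A's second attempt succeeds. */
static bool toy_relock_succeeds(bool handoff)
{
    struct toy_sem s = { .count = 1, .handoff = handoff };
    bool got = toy_try_down(&s);   /* A takes the lock */
    assert(got);
    if (!toy_try_down(&s))         /* B fails and sleeps */
        s.sleepers++;
    toy_up(&s);                    /* A releases */
    return toy_try_down(&s);       /* A retries before B runs */
}
```

In this model toy_relock_succeeds(true) is false (A blocks behind a
sleeper that has not even run yet), while toy_relock_succeeds(false) is
true (A re-takes the lock at once, at the cost of possibly starving B).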

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."