On Wed, 2010-03-31 at 19:38 -0400, Steven Rostedt wrote:
On Wed, 2010-03-31 at 16:21 -0700, Darren Hart wrote:
o What type of lock hold times do we expect to benefit?
0 (that's a zero) :-p
I haven't seen your patches but you are not doing a heuristic approach,
are you? That is, do not "spin" hoping the lock will suddenly become
free. I was against that for -rt and I would be against that for futex
too.
o How much contention is a good match for adaptive spinning?
- this is related to the number of threads to run in the test
o How many spinners should be allowed?
I can share the kernel patches if people are interested, but they are really early, and I'm not sure they are of much value until I better understand the conditions where this is expected to be useful.
Again, I don't know how you implemented your adaptive spinners, but the
trick to it in -rt was that it would only spin while the owner of the
lock was actually running. If it was not running, it would sleep. No
point waiting for a sleeping task to release its lock.
Right. This was *critical* for the adaptive rtmutex. Note in the RT
patch, everybody spins as long as the current owner is on CPU.
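For illustration only, the owner-running rule above can be sketched in C
roughly as follows. The names (struct task, struct lock,
adaptive_wait_step) are hypothetical stand-ins and are not taken from the
-rt rtmutex code; a real waiter would re-evaluate this decision on every
spin iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not kernel types. */
struct task {
    bool on_cpu;            /* true while the task is running on a CPU */
};

struct lock {
    struct task *owner;     /* NULL when the lock is free */
};

enum wait_action {
    WAIT_ACQUIRE,           /* lock is free: try to take it */
    WAIT_SPIN,              /* owner is running: keep spinning */
    WAIT_SLEEP              /* owner is off CPU: block instead */
};

/* One decision step for a waiter: spin only while the owner is on a CPU. */
enum wait_action adaptive_wait_step(const struct lock *l)
{
    assert(l != NULL);

    const struct task *owner = l->owner;

    if (owner == NULL)
        return WAIT_ACQUIRE;
    if (owner->on_cpu)
        return WAIT_SPIN;
    /* No point waiting for a sleeping task to release its lock. */
    return WAIT_SLEEP;
}
```

Note there is no spin limit or timeout here: the only thing that ends the
spin is the owner releasing the lock or leaving the CPU.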
FWIW, IIRC, Solaris has a heuristic approach where incoming tasks spin
for a period of time before going to sleep. (Cray UNICOS did the same.)
Is this what you did? Because, IIRC, this only benefited spinlocks
converted to mutexes. It did not help with semaphores, because
semaphores could be held for a long time. Thus, it was good for short
held locks, but hurt performance on long held locks.
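A toy cost model may help show why a fixed spin period helps short-held
locks and hurts long-held ones. Everything here is an assumption made up
for illustration (struct wait_model, the arbitrary time units, the flat
sleep/wakeup cost); it is not a measurement and not how Solaris or any
kernel accounts for this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model parameters, in arbitrary time units. */
struct wait_model {
    unsigned long spin_budget;   /* how long a waiter spins before sleeping */
    unsigned long sleep_cost;    /* overhead of one sleep/wakeup cycle */
};

/* Overhead paid by a waiter that spins for a fixed budget, then sleeps. */
unsigned long timed_spin_cost(const struct wait_model *m,
                              unsigned long hold_time)
{
    assert(m != NULL);

    if (hold_time <= m->spin_budget)
        return hold_time;                     /* lock freed while spinning */
    return m->spin_budget + m->sleep_cost;    /* spun in vain, slept anyway */
}

/* Overhead paid by a waiter that always sleeps immediately. */
unsigned long sleep_only_cost(const struct wait_model *m)
{
    assert(m != NULL);
    return m->sleep_cost;
}
```

In this model, a hold time below the budget costs less than a sleep/wakeup
cycle, while any hold time above the budget costs the full budget on top of
the sleep the waiter ends up doing anyway.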
nod. The entire premise was based on the fact that we were converting
spinlocks, which by definition were short held locks. What I found
during early development was that the sleep/wakeup cycle was more
intrusive for RT than spinning.
IIRC, I measured something like 380k context switches/second prior to
the adaptive patches for a dbench test and we cut this down to somewhere
around 50k, with a corresponding increase in throughput. (I can't
remember specific numbers any more, it was a while ago... ;-)
When applied to semaphores, the benefit was in the noise range as I
recall.
(dbench was chosen due to the heavy contention on the dcache spinlock)
Best,
-PWM
If userspace is going to do this, I guess the blocked task would need to
go into kernel, and spin there (with preempt enabled) if the task is
still active and holding the lock.
Then the application would need to determine which to use. An adaptive
spinner for short held locks, and a normal futex for long held locks.
-- Steve
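As a rough sketch of the split described above, an application could tag
each lock with the kind of hold time it expects and pick the slow path
accordingly. The names below (struct app_lock, app_lock_try, the
enumerators) are hypothetical, and the two kernel slow paths are only named,
not implemented; no existing futex operation is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Chosen by the application per lock; hypothetical, for illustration. */
enum lock_kind { LOCK_SHORT_HOLD, LOCK_LONG_HOLD };

/* Which path a contended locker would take next. */
enum slow_path {
    PATH_NONE,                  /* acquired in userspace, no kernel entry */
    PATH_KERNEL_ADAPTIVE_SPIN,  /* enter kernel, spin while owner runs */
    PATH_KERNEL_SLEEP           /* enter kernel, block like a normal futex */
};

struct app_lock {
    atomic_int word;            /* 0 = free, 1 = held */
    enum lock_kind kind;
};

/* Try the uncontended fast path; on contention report the slow path to use. */
enum slow_path app_lock_try(struct app_lock *l)
{
    assert(l != NULL);

    int expected = 0;

    if (atomic_compare_exchange_strong(&l->word, &expected, 1))
        return PATH_NONE;

    return l->kind == LOCK_SHORT_HOLD ? PATH_KERNEL_ADAPTIVE_SPIN
                                      : PATH_KERNEL_SLEEP;
}
```

The spin itself is left to the kernel side in this sketch, since that is
where the waiter can tell whether the owner is still running.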