Re: Spinlocks: Factor out GENERIC_LOCKBREAK in order to avoid spinning with irqs disabled

From: Jeremy Fitzhardinge
Date: Thu Jun 26 2008 - 11:50:10 EST


Peter Zijlstra wrote:
> Paravirt spinlocks sound like a good idea anyway, that way you can make
> them scheduling locks (from the host's POV) when the lock owner (vcpu)
> isn't running.
>
> Burning time spinning on !running vcpus seems like a waste to me.

In theory. But in practice, Linux locks are so low-contention that not much time seems to get wasted. I've been doing experiments with spin-a-while-then-block locks, but they never got to the -then-block part in my tests. Burning cycles spinning only gets expensive if the lock-holder vcpu gets preempted while there are other cpus spinning on that lock; but if locks are held only briefly, there's little chance of being preempted while holding one.

At least, that's how it looks at the scale I've been testing, with only two cores. I expect things look different with 8 or 16 cores and similarly scaled guests.
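
For reference, the experiment has roughly this shape. It's a simplified sketch rather than the actual patch: hv_block_on() and hv_kick() are made-up names standing in for whatever block/wakeup hypercalls the hypervisor provides, and the spin threshold is an arbitrary knob:

    extern void hv_block_on(void *lock);   /* hypothetical: sleep this vcpu */
    extern void hv_kick(void *lock);       /* hypothetical: wake blocked vcpus */

    #define SPIN_THRESHOLD 1024                  /* arbitrary tuning knob */
    #define cpu_relax() __builtin_ia32_pause()   /* x86 "pause" spin hint */

    struct pv_lock {
        volatile unsigned char lock;    /* 0 = free, 1 = held */
    };

    static void pv_lock_acquire(struct pv_lock *l)
    {
        for (;;) {
            unsigned int count = SPIN_THRESHOLD;

            /* Try to grab the lock. */
            if (__sync_lock_test_and_set(&l->lock, 1) == 0)
                return;

            /* Spin a while; most contention is over quickly. */
            while (l->lock) {
                if (--count == 0) {
                    /* Holder is probably preempted: block this
                     * vcpu in the hypervisor until kicked. */
                    hv_block_on(l);
                    break;
                }
                cpu_relax();
            }
        }
    }

    static void pv_lock_release(struct pv_lock *l)
    {
        __sync_lock_release(&l->lock);
        hv_kick(l);     /* wake any vcpus blocked on this lock */
    }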

> As for the scheduler solving the unfairness that ticket locks solve,

No, I never said the scheduler would solve the problem, merely mitigate it.

> that cannot be done. The ticket lock solves inter-cpu fairness for a
> resource other than time. The cpu scheduler only cares about fairness in
> time, and its intra-cpu fairness is on a larger scale than most spinlock
> hold times - so even if time and the locked resource would overlap it
> wouldn't work.

> The simple scenario is running N tasks on N cpus that all pound the same
> lock; cache issues will make it unlikely the lock would migrate away
> from whatever cpu it's on, essentially starving all the other N-1 cpus.

Yep. But in practice, the scheduler will steal the real cpu from under the vcpu dominating the lock and upset the pathological pattern. I'm not saying it's ideal, but the starvation case that ticket locks solve is pretty rare in the grand scheme of things.

Besides, ticket locks don't help if the lock is always transitioning locked->unlocked->locked across all the cpus. They only help in the case of one cpu doing rapid lock->unlock transitions while others wait on the lock.
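
To make the ordering point concrete, a ticket lock is essentially the following. This is a simplified sketch (reusing the cpu_relax() define from the sketch above); the real x86 implementation packs both fields into a single word and is careful about memory barriers, all of which I'm glossing over here:

    /* Simplified ticket lock.  'next' hands out tickets; 'owner' is
     * the ticket being served.  Unlock passes the lock to one specific
     * waiter, so the acquisition order is fixed the moment cpus start
     * spinning. */
    struct ticket_lock {
        volatile unsigned short next;   /* next ticket to hand out */
        volatile unsigned short owner;  /* ticket currently holding the lock */
    };

    static void ticket_lock_acquire(struct ticket_lock *l)
    {
        /* Atomically take a ticket... */
        unsigned short me = __sync_fetch_and_add(&l->next, 1);

        /* ...and spin until it's our turn, however long that takes. */
        while (l->owner != me)
            cpu_relax();
    }

    static void ticket_lock_release(struct ticket_lock *l)
    {
        l->owner++;     /* hand the lock to the next ticket, and nobody else */
    }

If the vcpu holding the next ticket isn't running, everyone behind it spins uselessly no matter what the host scheduler does, which is exactly the scenario below.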

> Ticket locks solve that exact issue; all the scheduler can do is ensure
> they're all spending an equal amount of time on the cpu. Whether that is
> spinning for lock acquisition or getting actual work done is beyond its
> scope.

Yes. But the problem with ticket locks is that they dictate a scheduling order, and if you fail to schedule in that order, vast amounts of time are wasted. You can get into this state:

1. vcpu A takes a lock
2. vcpu A is preempted, effectively turning a 5us lock hold into a 30ms one
3. vcpus E,D,C,B try to take the lock in that order
4. they all spin, wasting time; bad, but no worse than the old lock algorithm
5. vcpu A eventually runs again and releases the lock
6. vcpu B runs, spinning until preempted
7. vcpu C runs, spinning until preempted
8. vcpu D runs, spinning until preempted
9. vcpu E runs, and takes the lock and releases it
10. (repeat spinning on B,C,D until D gets the lock)
11. (repeat spinning on B,C until C gets the lock)
12. B finally gets the lock

Steps 6-12 are all caused by ticket locks, and the situation is exacerbated by vcpus F-Z trying to get the lock in the meantime while it's all tangled up handing out tickets in the right order.

The problem is that the old lock-byte locks made no fairness guarantees and interacted badly with the hardware, causing severe starvation in some cases. Ticket locks are too fair, and absolutely dictate the order in which the lock is taken. Really, all that's needed is the weaker assertion that "when I release the lock, any current spinner should get the lock".
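
One hypothetical shape for that weaker guarantee: a lock-byte where new arrivals defer to anyone already spinning, but the spinners race freely among themselves. A best-effort sketch (again reusing cpu_relax() from above), not a worked-out design:

    /* Best-effort sketch of "any current spinner gets the lock":
     * new arrivals back off while somebody is already waiting, but
     * the waiters race freely among themselves.  Deliberately racy,
     * since the guarantee itself is deliberately weak. */
    struct nudge_lock {
        volatile unsigned char lock;    /* 0 = free, 1 = held */
        volatile unsigned int waiters;  /* cpus currently spinning */
    };

    static void nudge_lock_acquire(struct nudge_lock *l)
    {
        /* Fast path only when nobody is already waiting. */
        if (l->waiters == 0 &&
            __sync_lock_test_and_set(&l->lock, 1) == 0)
            return;

        __sync_fetch_and_add(&l->waiters, 1);
        while (__sync_lock_test_and_set(&l->lock, 1))
            while (l->lock)
                cpu_relax();
        __sync_fetch_and_sub(&l->waiters, 1);
    }

    static void nudge_lock_release(struct nudge_lock *l)
    {
        __sync_lock_release(&l->lock);
    }

The point isn't this particular scheme; it's that releasing to some spinner, rather than to one specific spinner, avoids both the lock-byte starvation and the ticket lock's rigid handoff order.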

J
