Re: [PATCH] use unfair spinlock when running on hypervisor.

From: Avi Kivity
Date: Wed Jun 02 2010 - 05:00:57 EST


On 06/02/2010 11:50 AM, Andi Kleen wrote:
On Wed, Jun 02, 2010 at 05:51:14AM +0300, Avi Kivity wrote:
On 06/01/2010 08:27 PM, Andi Kleen wrote:
On Tue, Jun 01, 2010 at 07:52:28PM +0300, Avi Kivity wrote:

We are running everything on NUMA (since all modern machines are now NUMA).
At what scale do the issues become observable?

On Intel platforms it's visible starting with 4 sockets.

Can you recommend a benchmark that shows bad behaviour? I'll run it with
Pretty much anything with high lock contention.

Okay, we'll try to measure it here as soon as we can switch it into numa mode.

Do you have any idea how we can tackle both problems?
Apparently Xen has something, perhaps that can be leveraged
(but I haven't looked at their solution in detail)

Otherwise I would probably try to start with an adaptive
spinlock that at some point calls into the HV (or updates
shared memory?), like john cooper suggested. The tricky part here would
be to find the thresholds and fit that state into
paravirt ops and the standard spinlock_t.
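
As a rough illustration of the adaptive idea (a sketch only, not code from
this thread or from any kernel tree): spin for a bounded number of attempts,
then call out to the hypervisor. Everything here is an assumption made for
the example: the name adaptive_lock, the SPIN_THRESHOLD value, and the
hv_yield hook that stands in for a hypercall or a shared-memory update. The
fake_hv_yield stub only simulates the hypervisor scheduling the preempted
lock holder, which then releases the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Assumed value; finding a good threshold is the tricky part noted above. */
#define SPIN_THRESHOLD 1024u

/* Hypothetical lock type, not the kernel's spinlock_t. */
struct adaptive_lock {
    atomic_flag locked;
};

/* Hypothetical hook standing in for a hypercall or shared-memory update. */
typedef void (*hv_yield_fn)(struct adaptive_lock *lock);

static void adaptive_lock_release(struct adaptive_lock *lock)
{
    atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}

/*
 * Spin up to SPIN_THRESHOLD times, then ask the hypervisor for help and
 * start over. Returns how many times the hypervisor hook was invoked.
 */
static unsigned adaptive_lock_acquire(struct adaptive_lock *lock,
                                      hv_yield_fn hv_yield)
{
    unsigned yields = 0;

    assert(hv_yield != NULL);
    for (;;) {
        for (unsigned spins = 0; spins < SPIN_THRESHOLD; spins++) {
            /* test-and-set returns the old value: false means we got it */
            if (!atomic_flag_test_and_set_explicit(&lock->locked,
                                                   memory_order_acquire))
                return yields;
        }
        hv_yield(lock);
        yields++;
    }
}

/* Stub for the example: pretend the hypervisor ran the preempted holder,
 * and that the holder then dropped the lock. */
static unsigned fake_hv_yield_calls;

static void fake_hv_yield(struct adaptive_lock *lock)
{
    fake_hv_yield_calls++;
    adaptive_lock_release(lock);
}
```

An uncontended acquire never reaches the hook. An acquire against a held
lock spins SPIN_THRESHOLD times, calls the hook once, and succeeds once the
simulated holder has released the lock.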


There are two separate problems. The more general one is that the hypervisor can put a vcpu to sleep while that vcpu holds a lock, causing other vcpus to spin until the end of their time slice. This can only be addressed with hypervisor help. The second problem is that the extreme fairness of ticket locks causes lots of context switches if the hypervisor helps, and aggravates the first problem horribly if it doesn't (since now a vcpu will spin waiting for its ticket even if the lock is unlocked).
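
To make the second problem concrete, here is a minimal ticket lock sketch
with a single-threaded walk-through. It is an illustration under assumptions,
not the kernel's implementation: the struct layout, the function names, and
the three "vcpus" A, B and C are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal ticket lock, not the kernel's arch_spinlock_t. */
struct ticket_lock {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket that is currently allowed in */
};

static void ticket_init(struct ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

/* Join the queue; the caller must then wait until it is its turn. */
static unsigned ticket_take(struct ticket_lock *l)
{
    return atomic_fetch_add(&l->next, 1);
}

/* One iteration of the spin loop: may this ticket enter now? */
static bool ticket_is_my_turn(struct ticket_lock *l, unsigned ticket)
{
    return atomic_load(&l->owner) == ticket;
}

/* Hand the lock to the next ticket in line. */
static void ticket_unlock(struct ticket_lock *l)
{
    /* sanity check: someone must hold or be queued on the lock */
    assert(atomic_load(&l->owner) != atomic_load(&l->next));
    atomic_fetch_add(&l->owner, 1);
}

/*
 * Walk-through: A holds the lock, B and C are queued behind it.
 * A releases, but B has been descheduled by the hypervisor and never
 * runs. Returns true if C is still locked out although nobody holds
 * the lock.
 */
static bool demo_preempted_next_waiter(void)
{
    struct ticket_lock l;
    ticket_init(&l);

    unsigned a = ticket_take(&l);   /* A holds the lock */
    unsigned b = ticket_take(&l);   /* B queues, then gets preempted */
    unsigned c = ticket_take(&l);   /* C queues behind B */

    assert(ticket_is_my_turn(&l, a));
    ticket_unlock(&l);              /* A is done; the lock is now free */

    /* Only B may enter, and B is asleep, so C keeps spinning. */
    return ticket_is_my_turn(&l, b) && !ticket_is_my_turn(&l, c);
}
```

With a plain test-and-set lock, C could have taken the free lock at this
point. With tickets it has to wait for B to be scheduled again.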

So yes, we'll need hypervisor assistance, but even with that we'll need to reduce ticket lock fairness (retaining global fairness but sacrificing some local fairness). I imagine that will be helpful for non-virt as well, since local unfairness reduces bounciness.

--
error compiling committee.c: too many arguments to function
