Re: [rfc] "fair" rw spinlocks

From: Linus Torvalds
Date: Mon Nov 30 2009 - 12:06:23 EST




On Mon, 30 Nov 2009, Paul E. McKenney wrote:
>
> My suggestion would be to put the nesting counter in the task structure
> to avoid this problem.

It still doesn't end up being all that cheap. Now you'd need to disable
preemption in order to fix the race between the local counter and the real
lock.

That should be cheaper than cli/sti, but the downside is that now you need
that task struct pointer (both for the preemption disable and the
counter), so now you're adding some register pressure too. Of course,
maybe you don't even want to inline it anyway, in which case that doesn't
matter.

One advantage with your suggestion of using preemption is that (unlike irq
disables) you can keep the preemt counter over the whole lock, so you
don't need to re-do the preempt disable/enable in both read-lock and
read-unlock.

So you might end up with something like (UNTESTED!):

static void tasklist_write_lock(void)
{
spin_lock_irq(&tasklist_lock);
}

static void tasklist_write_unlock(void)
{
spin_unlock_irq(&tasklist_lock);
}

static void tasklist_read_lock(void)
{
preempt_disable();
if (!current->tasklist_count++)
spin_lock(&tasklist_lock);
}

static void tasklist_read_unlock(void)
{
if (!--current->tasklist_count)
spin_unlock(&tasklist_lock);
preempt_enable();
}

And the upside, of course, is that a spin_unlock() is cheaper than a
read_unlock() (no serializing atomic op), so while there is overhead,
there are also some advantages.. Maybe that atomic op advantage is enough
to offset the extra instructions.

And maybe we could use 'raw_spin_[un]lock()' in the above read-[un]lock
sequences, since tasklist_lock is pretty special, and since we do the
preempt disable by hand (no need to do it again in the spinlock code).
That looks like it might cut down the overhead of all of the above to
almost nothing for what is probably the common case today (ie preemption
enabled).

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/