Re: furwocks: Fast Userspace Read/Write Locks

From: Rusty Russell (rusty@rustcorp.com.au)
Date: Fri Mar 08 2002 - 01:27:06 EST


On 7 Mar 2002 12:17:52 -0800
"H. Peter Anvin" <hpa@zytor.com> wrote:

> Followup to: <20020307191043.9C5F33FE15@smtp.linux.ibm.com>
> By author: Hubertus Franke <frankeh@watson.ibm.com>
> In newsgroup: linux.dev.kernel
> >
> > Take a look at Rusty's futex-1.2, the code is not that different, however
> > if its all inlined it creates additional code on the critical path
> > and why do it if not necessary.
> >
> > In this case the futexes are the well tested path, the rest is a cludge on
> > top of it.
> >
>
> Perhaps someone could give a high-level description of how these
> "futexes" work?

Certainly! This is how Futexes IV (and Futexes V, ignoring the fairness
stuff that adds) works:

One or more userspace processes share address space, so they can both do
simple atomic operations on the same memory (hence the new PROT_SEM flag to
mmap/mprotect for architectures which need to know). They agree that "this
address contains an integer which we use as a mutex" (aka. struct futex).

The futex starts at 1 (available). down() looks like:
        if (atomic_dec_and_test(futex)) return 0; /* We got it! */
        else sys_futex(futex, DOWN); /* Go to kernel, wait. */

up() looks like:
        if (atomic_inc_positive(futex)) return 0; /* Noone waiting */
        else sys_futex(futex, UP); /* go to kernel, wake people */

Inside the kernel, we do what you'd expect. For sys_futex(futex, DOWN):
        Pin the page containing the futex, for convenience.
        Hash the kernel address of the futex to get a waitqueue.
        Add ourselves to the waitqueue.
        While !atomic_dec_and_test() (ie. futex still unavailable):
                sleep
                if signal pending, break;
        Unhash from waitqueue
        unpin page.
        return success or -EINTR.

For sys_futex(futex, UP):
        Pin page for convenience.
        Hash kernel address of futex to get the waitqueue.
        set futex to 1 (ie. available).
        Wake up the first one on the waitqueue waiting for this futex.
        unpin page

The only two twists to add are that we don't keep atomic_dec_and_test'ing
in the loop forever (if it's already negative we don't bother), so counter
doesn't wrap, and we don't use actual waitqueues because we share them, but
we want to know which futex each waiter is waiting on.

Pretty simple, no?
Rusty.

-- 
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Mar 15 2002 - 22:00:07 EST