Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation

From: David Howells
Date: Fri Dec 16 2005 - 07:00:26 EST


Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:

> I was under the impression that with cmpxchg, you don't need the mutex lock.
> If you do then sure, cmpxchg doesn't buy you anything (even if the arch does
> natively support it).

Consider the slow path...

Imagine the mutex has three states: 0 (unset), 1 (held), 3 (contention).

MUTEX PROCESS A PROCESS B PROCESS C
====== ============== ============== ==============
1,B,-
1,B,- -->mutex_lock()
1,B,- cmpxchg(0,1) [failed]
1,B,- -->__mutex_lock()
1,B,- -->mutex_unlock()
1,B,- cmpxchg(1,0) [success]
0,-,- <--mutex_unlock()
0,-,- spin_lock
0,-,A cmpxchg(0,1) [success]
1,A,A spin_unlock
1,A,- <--__mutex_lock()
1,A,- <--mutex_lock()

Or:

MUTEX PROCESS A PROCESS B PROCESS C
====== ============== ============== ==============
1,B,-
1,B,- -->mutex_lock()
1,B,- cmpxchg(0,1) [failed]
1,B,- -->__mutex_lock()
1,B,- spin_lock
1,B,A cmpxchg(0,1) [failed]
1,B,A -->mutex_unlock()
1,B,A cmpxchg(1,0) [success]
0,-,A <--mutex_unlock()
0,-,A cmpxchg(1,3) [failed]
0,-,A cmpxchg(0,1) [success]
1,A,A spin_unlock
1,A,- <--__mutex_lock()
1,A,- <--mutex_lock()

Or:

MUTEX PROCESS A PROCESS B PROCESS C
====== ============== ============== ==============
1,B,-
1,B,- -->mutex_lock()
1,B,- cmpxchg(0,1) [failed]
1,B,- -->__mutex_lock()
1,B,- spin_lock
1,B,A cmpxchg(0,1) [failed]
1,B,A -->mutex_unlock()
1,B,A cmpxchg(1,0) [success]
0,-,A <--mutex_unlock()
0,-,A cmpxchg(1,3) [failed]
0,-,A -->mutex_lock()
0,-,A cmpxchg(0,1) [success]
1,C,A <--mutex_lock()
1,C,A cmpxchg(0,1) [failed]
1,C,A cmpxchg(1,3) [success]
3,A,A spin_unlock
3,A,- <--__mutex_lock()
3,A,- <--mutex_lock()

See how many cmpxchgs you may end up doing? Now imagine that cmpxchg() is
implemented as:

spin_lock_irqsave(&common_lock[N], flags);
actual = *state;
if (actual == old)
*state = new;
spin_unlock_irqrestore(&common_lock[N], flags);

Now my point about using LL/SC is that:

1,C,A cmpxchg(0,1) [failed]
1,C,A cmpxchg(1,3) [success]
3,C,A ...

Can be turned into:

1,C,A x = LL()
1,C,A x |= 2;
1,C,A SC(3) [success]
3,C,A ...

On x86 you could use:

1,C,A LOCK OR (2)
3,C,A ...

instead.

Now, contention isn't very likely, so using CMPXCHG _may_ be good enough _if_
you have it. But if you have to emulate it by using spinlocks, you're far
better off just wrapping the entire thing in spinlocks and not pretending use
atomic ops to access the counter; unless, of course, you have somthing that
can do a 1-bit XCHG or better...

MUTEX PROCESS A PROCESS B PROCESS C
====== ============== ============== ==============
0,-,- -->mutex_lock()
0,-,- xchg(1) == 0
1,B,- <--mutex_lock()
1,B,-
1,B,- -->mutex_lock()
1,B,- xchg(1) == 1
1,B,- -->__mutex_lock()
1,B,- -->mutex_unlock()
1,B,B spin_lock
1,B,B set(0)
0,-,B spin_unlock
0,-,- <--mutex_unlock()
0,-,- spin_lock
0,-,A xchg(1) == 0
1,A,A spin_unlock
1,A,- <--__mutex_lock()
1,A,- <--mutex_lock()

mutex_unlock() should get the spinlock here before modifying the count,
because if there's anything on the queue, it should wake up the first waiter
rather than clearing the count.

David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/