Re: MM deadlock [was: Re: arca-vm-8...]

Tom Eastep (eastep@loc1.tandem.com)
Tue, 12 Jan 1999 04:22:40 +0000


Linus Torvalds wrote:
>
> On Mon, 11 Jan 1999, MOLNAR Ingo wrote:
> >
> > ps. not that it counts too much, but my embarrasing email brought me to
> > really closely check all the semaphore code again, and it seems to be
> > all really cool and accurate. I've 'unfolded' the code into pseudocode
> > by hand:
>
> The not cool part about my implementation of recursive semaphores is:
>
> CPU #1 CPU#2
>
> down()
> ..
> up()
>
> "owner" is P#1, count=1
>
> down() down()
> lock ; decl succeeds
>
> lock ; decl fails
> goto slow part
> slow part sees stale owner
> returns success
>
> "owner" is P#2
>
> BOOM! - both CPU's are now inside the critical region.
>
> The problem is that "owner" can be stale, and can be updated outside the
> semaphore spinlock by a successful down() - which means that there are no
> synchronizing primitives there to guarantee that the slow part sees the
> new owner.
>
> Now, setting the new owner is done in the very next instruction after
> getting the semaphore, so CPU#1 has to be really fast, and CPU#2 has to be
> really slow for the above to happen. But it's still possible, and CPU#2
> taking an interrupt could cause that to happen.
>
> And I don't see any way of getting rid of it without another spinlock. I
> _could_ possibly do it with something like
>
> up(sem) {
> if (!sem.count)
> sem.owner = 0;
> old_up();
> }
>
> but I can't convince myself that that always works either.
>
> Linus
>

The way I've implemented this in the past is to make the recursive
semaphore a 3-tuple:

( old_semaphore , owner, owner_count ) where owner and owner_count
are initially zero.

down() {
if ( owner == current ) {
owner_count++;
return;
}
old_down();
owner = current;
owner_count = 1;
}

up() {
if ( ! --owner_count ) {
owner = 0;
old_up();
}

-Tom

-- 
Tom Eastep
Compaq Computer Corporation
Enterprise Computing Group
Tandem Division
tom.eastep@compaq.com

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/