Re: robust futex heap support patch

From: david singleton
Date: Tue Dec 06 2005 - 10:23:22 EST



On Dec 6, 2005, at 6:06 AM, Esben Nielsen wrote:

On Mon, 5 Dec 2005, david singleton wrote:


On Dec 5, 2005, at 2:30 PM, Esben Nielsen wrote:

I'm currently trying to close a race condition between futex_wait_robust
and futex_wake_robust that Dave Carlson is seeing on his SMP system.

The scenario is as follows:

Thread A locks an pthread_mutex via the fast path and does not enter
the kernel.

Thread B tries to lock the lock and sees it is already locked. Thread
B sets the
waiters flag in the lock and enters the kernel to lock the lock on
behalf of
thread A and then block on the mutex waiting for it's release.

Thread A unlocks the lock and sees the waiters flag set. Thread A gets
to the futex_wake_robust before Thread B can get to futex_wait_robust.

Thread A sees that it does not own the lock in the kernel and was
returning EINVAL.

patch-2.6.14-rt21-rf8 was a preliminary patch for Dave Carlson to try
and get more information about
the race condition. ( rf8 and rf9 are still returning EAGAIN from
thread B trying to
do the futex_wait_robust and the library should be retrying with
EAGAIN, but it currently isn't).


*nod*
I just pointed out that you can't make thread A loop the way you do.

What I would do was to do the user space flag checks while having the
raw spinlock of the rt_mutex. That way you are sure that stuff in the
kernel is serialized. But I don't know what to do exactly to do in
there....

Yes, and you are correct. That patch was an attempt at seeing how far the wake code
had to go to let the waiters in on an SMP machine.


When I get the race condition closed I'll post the patch and notify
everyone on lkml
and the robustmutexes mailing lists.

Where can I sign up to that mailing list? I would like to follow the
development, although I don't have much time to contibute.

The mail lists you cc'd will do it, robustmutexes@xxxxxxxxxxxxxx and linux-kernel@xxxxxxxxxxxxxxxx

David


Esben


David



Hi,
I got a little time to look at your current patch (2.6.14-rt21-rf8).
I noticed a problem in futex_wake_robust(). You have a "goto retry" to
solve the following situation:

Task A Task B
takes futex in userspace
Tries to take mutex and sets the
waiting bit in user space
releases futex, notices task B
calls kernel and enters
futex_wake_robust()

retry:
if not owner in rt_mutex
goto retry;
Calls kernel
Makes A owner in rt_mutex
blocks
Leaves retry-loop and
completes the futex wake
operation as normally.


However, if Task A is RT on a UP machine it will go on in the retry
loop
forever. Task B will never get the CPU to complete it's kernel-call.

You have probably by manipulating the userspace flag from within the
rt_mutex code :-(

Esben


On Fri, 25 Nov 2005, david singleton wrote:

There is a new patch, patch-2.6.14-rt15-rf1, that adds support for
robust and priority inheriting
pthread_mutexes on the 'heap'.

The previous patches only supported either file based pthread_mutexes
or mmapped anonymous memory based pthread_mutexes. This patch allows
pthread_mutexes
to be 'malloc'ed while using the PTHREAD_MUTEX_ROBUST_NP attribute
or PTHREAD_PRIO_INHERIT attribute.

The patch can be found at:

http://source.mvista.com/~dsingleton

David

-
To unsubscribe from this list: send the line "unsubscribe
linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/





-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/