Re: [RFC/PATCH] FUSYN Realtime & Robust mutexes for Linux try 2

From: Jamie Lokier
Date: Wed Dec 03 2003 - 23:15:26 EST


Here's my first thoughts, on reading Documentation/fusyn.txt.

- Everything here can be implemented on top of regular futex, but
that doesn't mean the implementation will be efficient or robust.
(This is because, in general, any kind of futex/fuqueue/whatever
implementation can be moved from kernel space to userspace, using
the real kernel futexes to simulate the waitqueues and spinlocks
that are called in futex.c).

There are some good ideas in here, for people who need the features:

- Priority inheritence is ok _when_ you want it.

Sometimes if task A with high priority wants a resource which is
locked by task B with lower priority, that should be an error
condition: it can be dangerous to promote the priority of task B,
if task B is not safe to run at a high priority.

- The data structures and priority callbacks which are used to
implement priority inheritance, protection and highest-priority
wakeup are fine.

But highest-priority wakeup (at least) shouldn't be just for
fuqueues: it should be implemented at a lower level, directly in
the kernel's waitqueues. Meaning: wake_up() should wake the
highest priority task, not the first one queued, if that is
appropriate for the queue or waker.

- I like the method of detecting dead mutex holders, and the ability
to handle recovery of inconsistent data.

- Is there a method for passing the locked state to another task?
Compare-and-swap old-pid -> new-pid works when there isn't any
contention, but a kernel call is needed in any of the
kernel-controlled states.

- It's very unpleasant that rwlocks enter the kernel when there is
more than one reader. Hashed rwlocks can be implemented in
userspace to reduce this (readers take one rwlock from a hashed
set; writers take them all in order), but it isn't wonderful.

- For architectures which can't do compare-and-swap, a system call
which does the equivalent (i.e. disables preemption, does
compare-and-swap, enables preemption again) would be quite useful.
Not for maximum performance, but to allow architecture-independent
locking strategies to be written portably.

- Similarly, it would be good for the VFULOCK_ flags to depend on
only 31 bits of the word, i.e. ignoring one of them. Then
architectures which don't have compare-and-swap but which do have
test-and-set-bit can use that.

- Does the owner field have to be the pid or can a fulock use any
kind of key value, as long as it isn't one of the reserved values,
that's convenient to the application.

- It's nice that you use separate syscalls for each operation.
Futex should do this too; multiplexing syscalls like the one in
futex.c should be banned :)

- It's a huge patch. A nice thing about futex.c is that it's
relatively small (your patch is 9 times larger). The original
futex design was more complicated, and written specifically for
mutexes. Then it was made simpler and I think smaller at the same
time. Perhaps putting some of the RT priority capabilities
directly into kernel waitqueues would help with this.

-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/