Re: Why 'wait queues' and not 'channels'

Malcolm Beattie (mbeattie@sable.ox.ac.uk)
Mon, 17 May 1999 14:12:41 +0100 (BST)


David S. Miller writes:
> Date: Sat, 15 May 1999 20:58:17 +0200 (MET DST)
> From: Gerard Roudier <groudier@club-internet.fr>
>
> Linux waits on 'wait queues' but other UNIXes sleep on 'channels'.
>
> What is better with wait queues?
>
> Believe me when I say that we don't want the traditional sleeping
> methodology of "UNIX" in any way shape or form.
>
> As far as I know, only Plan9 and Linux get the sleeping method correct
> (someone please correct me if some other OS does things this way as
> well).
>
> Essentially the crucial difference is, under Linux we:
>
> add_to_wait_queue
> while(1) {
> check event, break if event happened
> break if signal arrived
> schedule();
> }
> remove_from_wait_queue
>
> On first sight, you may think nothing interesting is happening here.
>
> Think harder, the issue is that we eliminate completely any races
> between the adding to wait queue and test of the event. The task
> himself controls completely his existence on the wait queue.
>
> Other systems have a more complex issue about avoiding the race
> between the test and the sleep (especially on SMP) because they have
> only a "add to wait queue and schedule()" singular interface.
>
> I suggest not changing the wait queue architecture we have, it's done
> right and cleanly.

Digital UNIX has a two-step approach like Linux although,
unsurprisingly, it's more complex partly because it provides a
mixture of Mach-ish and BSD-ish APIs. Unlike Linux, though, it
does keep a hash of wait queues. The two steps are
assert_wait(event, interruptible);
...
thread_block();
The assert_wait() says the thread is about to block and puts it on
the wait queue. thread_block() is the equivalent of Linux
schedule(). I suppose in some ways assert_wait() (in its simplest
guise) is more like
current->state = TASK_INTERRUPTIBLE;

On top of that, though, Digital UNIX then has a whole host of
additions: there are plenty of macros around the basic primitives
which let you do things like pass in a simple or complex lock to be
locked/unlocked where necessary, you can specify timeouts, you can
send a result along with a wakeup to say why you woke the target up
(thread_wakeup_with_result), you can do thread_wakeup to wakeup all
threads or thread_wakeup_one to wake one, you can even clear the
wait condition of another thread explicitly (plus all this has to
cope with swapped out task structures). There are macros to support
the BSDish API with tsleep, mpsleep as well as the Machish ones.

In brief, the two-step assert/block is like Linux but the complex
additional APIs certainly isn't.

--Malcolm

-- 
Malcolm Beattie <mbeattie@sable.ox.ac.uk>
Unix Systems Programmer
Oxford University Computing Services

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/