Re: Scheduler: Spinning until tasks are STOPPED

From: Nick Piggin
Date: Mon May 09 2005 - 01:08:17 EST


Yuly Finkelberg wrote:
Well it still sounds like the kernel is doing too much. For example,
why don't you just have a syscall (or char device) just to send out
the events, and do everything else (all the queueing and
synchronisation and signalling) in userspace?


True, it does :)
However, the operation requires that a consistent in-kernel state will
hold for the tasks. They all (except for the last one) do some work,
send a SIGSTOP to themselves, and return to usermode (handling the
STOP signal). The last task does not stop itself, but instead
verifies that all of the others are stopped before it returns.


Well you can do that all from userspace basically.
At least you should be able to do it as well as you can from
kernel (providing you have a syscall/device to retrieve events).


OK, for a simple example, instead of spinning on yield(), do a
down() on a locked mutex.
Then have maybe an `atomic_t nr_running` which is incremented for
each worker task running. When they are ready to stop, they can
do an atomic_dec_and_test of nr_running, and the last one can up()
the mutex. If you absolutely need to know when the process is
actually stopped, why?


I need to ensure that the internal state of the processes will not
change (unless of course some other signal gets delivered, which will
not be the case).


They won't change *much*. Nothing that is detectable from userspace
(except for perhaps /proc entries).

It doesn't look like the problem is with the task that is spinning
until the others have stopped. Instead, it looks like one of the
other tasks is spinning somewhere in between the time that it wakes up
its successor and the time that it sends it self the STOP signal. It
is clearly getting preempted but then makes no progress (sometimes for
a VERY long time).


Well it is a bit difficult to help further because you haven't posted
the working code or said what you are trying to do.

Stick a few printks around the place or try a kernel debugger to see
if you can't work out what is going wrong. Compiling the kernel with
debug info can allow you to find out what line of code EIP is pointing
to, which can also be helpful.

--
SUSE Labs, Novell Inc.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/