>No, if we forgot to add the process to the run-queue, it would still
>have been marked as TASK_RUNNABLE - even though it would never have been
>actually run. And you said that the stuck processes are always stuck in
>disk wait according to "ps"... So wake_up_process() was never called at
>all.
Well, they were certainly not marked as running or suspended, but I never
did say that they were marked as being in disk wait according to ps.
Actually, ps showed them in a state designated with a dot, like in:
  PID TT STAT  TIME
  185  ? .    56:02 /usr/sbin/innd -p4 -r -i0 -c4 -L
I'm not sure where the dot comes from, or what it should designate.
(I'm using procps as in the bo distribution of Debian.)
I forgot to check the current->state from within kdebug, but that's
because current was not in the context (so gdb told me).
> - something clears the locked state without waking people up. Do you
> use "md" or anything else that plays around with buffers?
Which still makes me kind of wonder why my rearrangement fixes things.
The only behaviour that apparently changes here is that *if*
current->state is altered during the execution of run_task_queue(&tq_disk),
then we no longer overwrite it before jumping into schedule().
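
To make that ordering difference concrete, here is a toy userspace sketch, not the kernel code itself. It assumes that the rearrangement amounts to assigning current->state *before* calling run_task_queue(&tq_disk) instead of after it. All names (toy_task, toy_run_task_queue and the two state_at_schedule_* functions) are made up for illustration, and the stand-in queue simply alters the state as a side effect.

```c
/* Toy stand-ins for the kernel's task states; the values are arbitrary. */
enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE };

struct toy_task {
    enum task_state state;
};

/*
 * Hypothetical stand-in for run_task_queue(&tq_disk).  Assumption: the
 * queued work alters the task state as a side effect (e.g. a wake-up).
 */
static void toy_run_task_queue(struct toy_task *cur)
{
    cur->state = TASK_RUNNING;
}

/*
 * Assumed original ordering: run the queue first, then set the state.
 * Returns the state the task would have when it reaches schedule().
 */
enum task_state state_at_schedule_original(void)
{
    struct toy_task cur = { TASK_RUNNING };

    toy_run_task_queue(&cur);
    cur.state = TASK_UNINTERRUPTIBLE;  /* overwrites whatever the queue did */
    return cur.state;
}

/*
 * Assumed rearranged ordering: set the state first, then run the queue.
 * Any change made while the queue runs survives until schedule().
 */
enum task_state state_at_schedule_rearranged(void)
{
    struct toy_task cur = { TASK_RUNNING };

    cur.state = TASK_UNINTERRUPTIBLE;
    toy_run_task_queue(&cur);          /* alteration is not overwritten */
    return cur.state;
}
```

With the first ordering the task would enter schedule() as TASK_UNINTERRUPTIBLE regardless of what happened while the queue ran; with the second it would enter schedule() in whatever state the queue left it. Whether anything in the real tq_disk path actually touches current->state is exactly the open question.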
> - really strange K5 bug
Which would be even more difficult to explain in the light of my
patch.
-- 
Sincerely,                                                   srb@cuci.nl
           Stephen R. van den Berg (AKA BuGless).

He did a quarter of the work in *half* the time!