Re: uninterruptible sleep lockups
From: Chris Friesen
Date: Tue Feb 22 2005 - 19:27:10 EST
linux-os wrote:
Before I get into the reply, I just want to make it clear that I'm not
arguing that we *should* do any of this, just that it is not technically
impossible. It's a thought experiment, not a design suggestion.
All wonderful. However, it dosn't fix the problem. You are,
again, assuming that the problem is the symptom! The problem
is that some piece of code is not handling an exception
properly. It is waiting forever for something that will
never happen. It's that CODE that needs to be fixed.
Absolutely. I'm just theorizing that it is possible to devise a system
that would be able to deal with such a situation, analogous to the way
the kernel can deal with bugs in userspace processes (segfaults, traps,
etc.).
"Cleaning" up the immediate symptoms doesn't let
the next thread that acquires the "cleaned up" lock
use the hardware because it has jammed code between
that thread and the hardware.
If the system is designed such that all resources are tracked, then you
could clean them up when the "hung" entity is killed (the way we do it
for userspace resources). In this case there is no more jammed code.
The next guy to aquire the mutex knows the hardware is in an
undetermined state, and is responsable for reinitializing it to a known
state. This would be horribly complicated, but I don't think it would
be impossible.
The bad code needs to be fixed. If the bad code is
fixed, you will __never__ have a process stuck
in 'D' state unless you run for the 1000 years
that could statistically result in a bit in
the semaphore getting flipped.
I don't disagree with you on this. I think that fixing the bad code is
absolutely the way to go. I'm simply indulging in a thought experiment
as to whether or not it is theoretically possible to create a system
that would be able to clean up after this sort of thing once it has
happened.
Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/