Re: [PATCH] Softlockup (out of cpu) killer

From: Frederic Weisbecker
Date: Sun Dec 11 2011 - 19:28:38 EST


On Sun, Dec 11, 2011 at 02:48:55PM -0800, Vincent Li wrote:
> In the kernel there is an out of memory (OOM) killer, so why not add an out of cpu (OOC)
> killer? I tested the following patch by running a user-space cpu hogging process, and the
> softlockup detector killed the process successfully.
>
> A softlockup can be caused by a user-space process hogging the cpu. Add a softlockup_kill
> kernel config to allow the kernel to kill the user-space cpu hogging process. This feature
> is useful for high availability systems that have uptime guarantees and where a softlockup
> must be resolved ASAP.
>
> echo 1 > /proc/sys/kernel/softlockup_kill to enable cpu hog process killer
> echo 0 > /proc/sys/kernel/softlockup_kill to disable cpu hog process killer

That assumes a signal would be enough to pull a process out of its softlockup.
I believe this is seldom the case. A process in a softlockup is stuck in some
place that has preemption disabled. Unless it luckily polls there for pending
signals, that won't work.
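
To illustrate the point about polling, here is a rough userspace analogy in
C. It is not kernel code and not part of the patch. It assumes that blocking
SIGTERM with sigprocmask() can stand in for a kernel path that never reaches
a point where signals are acted on, and that sigpending() can stand in for a
signal_pending(current) check. The helper names term_is_pending() and
spin_with_queued_sigterm() are made up for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/*
 * Userspace analogy only (an assumption for illustration, not kernel code):
 * a blocked SIGTERM models a signal that is queued but never acted on, and
 * sigpending() models an explicit signal_pending(current) check.
 */

/* Returns 1 if SIGTERM is queued for this process but not yet delivered. */
static int term_is_pending(void)
{
    sigset_t pending;

    if (sigpending(&pending) != 0)
        return 0;
    return sigismember(&pending, SIGTERM) == 1;
}

/*
 * Hypothetical helper: spin for at most max_iters iterations with SIGTERM
 * blocked and already queued. Returns the iteration at which the loop
 * noticed the signal, or -1 if it never looked (or setup failed).
 */
static long spin_with_queued_sigterm(int poll_for_signal, long max_iters)
{
    sigset_t block, old;
    long i, noticed_at = -1;
    int sig;

    assert(max_iters >= 0);

    sigemptyset(&block);
    sigaddset(&block, SIGTERM);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    /* The "killer" sends its signal; it stays pending while blocked. */
    raise(SIGTERM);

    for (i = 0; i < max_iters; i++) {
        /* Only a loop that polls can react to the queued signal. */
        if (poll_for_signal && term_is_pending()) {
            noticed_at = i;
            break;
        }
    }

    /* Consume the queued signal so restoring the mask does not kill us. */
    sigwait(&block, &sig);
    sigprocmask(SIG_SETMASK, &old, NULL);
    return noticed_at;
}
```

Calling spin_with_queued_sigterm(0, n) runs to the limit and returns -1,
while spin_with_queued_sigterm(1, n) returns on its first iteration. The
signal only helps the loop that looks for it.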

But maybe that happens more often than I think. Maybe other people have
more insight.