[PATCH] watchdog: make sure the watchdog thread gets CPU on loaded system

From: Michal Hocko
Date: Tue Mar 13 2012 - 05:34:44 EST


If the system is heavily loaded while hotplugging a CPU, we might end up
with a bogus hardlockup detection. This has been seen during LTP pounder
test executed in parallel with the hotplug test.

The hard lockup detector consists of two parts:
- watchdog_overflow_callback (executed as a perf counter callback
  from NMI), which checks whether the per-cpu hrtimer_interrupts has
  changed since the last time it ran and panics if not
- the watchdog kernel thread, which starts watchdog_hrtimer, which
  periodically updates hrtimer_interrupts.
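
To make the interplay of the two parts concrete, here is a minimal
userspace sketch of the check. It is a model only, not the kernel code:
struct cpu_state, model_hrtimer_tick() and model_overflow_check() are
hypothetical names, and a plain struct stands in for the per-cpu
variables.

```c
#include <stdbool.h>

/* Userspace model of one CPU's watchdog state; not kernel code. */
struct cpu_state {
	unsigned long hrtimer_interrupts;       /* bumped by the hrtimer */
	unsigned long hrtimer_interrupts_saved; /* value seen by the last check */
};

/* Models the periodic hrtimer started by the watchdog thread. */
static void model_hrtimer_tick(struct cpu_state *cs)
{
	cs->hrtimer_interrupts++;
}

/*
 * Models the check done from the perf counter (NMI) callback.
 * Returns true when the counter did not move since the last check,
 * which is the condition reported as a hard lockup.
 */
static bool model_overflow_check(struct cpu_state *cs)
{
	if (cs->hrtimer_interrupts_saved == cs->hrtimer_interrupts)
		return true;

	cs->hrtimer_interrupts_saved = cs->hrtimer_interrupts;
	return false;
}
```

As long as at least one model_hrtimer_tick() happens between two
model_overflow_check() calls, the check stays quiet. Two checks in a row
without a tick in between are reported as a lockup.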

The main problem is that watchdog_enable (called when a CPU is brought up)
registers a perf event but the hrtimer is started later when the watchdog
thread gets a chance to run.

The watchdog thread currently starts with normal priority and boosts
itself as soon as it gets to a CPU. This might, however, already be too
late, as demonstrated by the LTP pounder test executed in parallel with
the LTP hotplug test. There are zillions of userspace processes sitting
in the runqueue while the number of online CPUs goes down to 1. CPUs are
onlined back in the second stage, where the issue triggers.

When we online a CPU and create the watchdog kernel thread, it will take
some time until it gets to a CPU. On the other hand, the perf counter
callback is executed in a timely fashion, so we explode the first time
it finds out that hrtimer_interrupts wasn't incremented.
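
The race can be illustrated with a small userspace timeline model. This
is only a sketch under stated assumptions: time is counted in arbitrary
ticks, the perf event is armed at tick 0, and the function name
first_bogus_report() and all of its parameters are hypothetical. The
periods are inputs chosen for illustration, not the values the kernel
uses.

```c
#include <stdbool.h>

/*
 * The perf callback runs every nmi_period ticks, starting from tick 0.
 * The hrtimer only starts once the watchdog thread first runs, at tick
 * thread_delay, and then fires every hrtimer_period ticks.
 *
 * Returns the tick of the first bogus lockup report, or -1 if none
 * occurs within horizon ticks.
 */
static long first_bogus_report(long nmi_period, long hrtimer_period,
			       long thread_delay, long horizon)
{
	unsigned long interrupts = 0, saved = 0;
	long t;

	for (t = 1; t <= horizon; t++) {
		/* The hrtimer fires only after the thread has started it. */
		if (t > thread_delay && (t - thread_delay) % hrtimer_period == 0)
			interrupts++;

		if (t % nmi_period == 0) {
			if (saved == interrupts)
				return t;	/* counter did not move */
			saved = interrupts;
		}
	}
	return -1;
}
```

In this model a short thread_delay lets the hrtimer tick before the
first callback and nothing is reported, while a thread_delay longer than
nmi_period trips the very first check although the CPU is healthy.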

Let's fix this by boosting the watchdog thread priority before we wake it
up rather than when it's already running.
This still doesn't handle the case where the runqueue holds a similar
number of high-priority FIFO tasks, but that doesn't seem to be common.
The current implementation doesn't handle that case either, so this is
at least no worse.
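
The effect of boosting before the wake-up can be sketched with a toy
runqueue. This is a deliberately simplified userspace model and not how
the kernel scheduler works: struct toy_task and
slices_before_watchdog() are hypothetical, every task gets exactly one
slice, the highest rt_prio wins, and ties go to the task queued first.

```c
#include <stddef.h>

struct toy_task {
	int rt_prio;      /* 0 = normal task, higher wins (FIFO-like) */
	int is_watchdog;  /* non-zero for the watchdog thread */
	int ran;          /* already had its slice */
};

/*
 * Counts how many other tasks get a slice before the watchdog thread
 * runs for the first time. Returns -1 if there is no watchdog thread
 * on the runqueue.
 */
static long slices_before_watchdog(struct toy_task *rq, size_t n)
{
	long slices = 0;

	for (;;) {
		size_t i, best = n;

		/* Pick the highest priority task, earliest queued on ties. */
		for (i = 0; i < n; i++) {
			if (rq[i].ran)
				continue;
			if (best == n || rq[i].rt_prio > rq[best].rt_prio)
				best = i;
		}
		if (best == n)
			return -1;
		if (rq[best].is_watchdog)
			return slices;

		rq[best].ran = 1;
		slices++;
	}
}
```

With the watchdog queued last at rt_prio 0 behind three normal tasks,
the model reports 3 slices of delay. With a high rt_prio already set at
wake-up time (99 is an arbitrary illustrative value here) it reports 0.
If other tasks share the same high rt_prio and were queued earlier, the
watchdog still waits behind them, which matches the remaining limitation
described above.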

Unfortunately, we cannot start the perf counter from the watchdog thread
because we could miss a real lockup, and we also cannot start the
hrtimer from watchdog_enable because there is no way (at least I
don't know of any) to start a hrtimer from a different CPU.
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic