Re: [PATCH,RFC] RCU-based detection of stalled CPUs for Classic RCU

From: Ingo Molnar
Date: Thu Oct 02 2008 - 04:13:23 EST



* Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> Hello!
>
> This patch adds stalled-CPU detection to Classic RCU. This capability
> is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR,
> which is disabled by default. This is a debugging feature, not something
> that non-kernel-hackers would be expected to care about. This feature
> can detect looping CPUs in !PREEMPT builds and looping CPUs with
> preemption disabled in PREEMPT builds. This is essentially a port of
> this functionality from the treercu patch.
>
> One current shortcoming: on some systems, stalls are detected during
> early boot, when we normally would not care about them. My thought is
> to add a call from late initialization to suppress stall detection
> until the system is well along its way to being booted, but thought I
> should check to see if there might already be something for this
> purpose.

Could you be a bit more specific: why do those warnings show up, and why
don't we care about them? Things like networking occasionally do a
synchronize_rcu(), and a stall there could mean a bootup hang.
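
To make the trade-off concrete, here is a minimal userspace sketch of the
mechanism under discussion. It is not code from the patch: every name and
the timeout value are hypothetical. Each CPU records when it last passed a
quiescent state, a periodic check flags CPUs that have gone too long without
one, and detection stays off until late initialization enables it.

```c
#include <stdbool.h>

#define NR_CPUS 4
#define STALL_TIMEOUT_TICKS 3	/* hypothetical threshold, in ticks */

/* Hypothetical per-CPU record: tick of the last quiescent state. */
static unsigned long last_qs_tick[NR_CPUS];

/* Stays false during early boot, so early stalls are not reported. */
static bool stall_detection_enabled;

/* Record that @cpu passed through a quiescent state at tick @now. */
static void note_quiescent_state(int cpu, unsigned long now)
{
	last_qs_tick[cpu] = now;
}

/*
 * Assumed to be called once from late initialization. Timestamps are
 * reset so that time spent in early boot does not count as a stall.
 */
static void enable_stall_detection(unsigned long now)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		last_qs_tick[cpu] = now;
	stall_detection_enabled = true;
}

/* Return a bitmask of CPUs with no quiescent state within the timeout. */
static unsigned int check_for_stalls(unsigned long now)
{
	unsigned int stalled = 0;

	if (!stall_detection_enabled)
		return 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (now - last_qs_tick[cpu] > STALL_TIMEOUT_TICKS)
			stalled |= 1u << cpu;

	return stalled;
}
```

Note that in this sketch anything that stalls before enable_stall_detection()
runs goes unreported, which is exactly the case where a genuine bootup hang
would be hidden.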

> (Currently against 2.6.27-rc8, FYI.)
>
> Thoughts?
>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>

I think this is a very good idea in general: the question often comes up
whether a hang seen in the RCU code is indeed caused by RCU or by other
factors. Could you perhaps rebase it against tip/core/rcu? [or
tip/master for convenience]

Ingo