Re: [PATCH 3/3] [RFC] nmi_watchdog: config option to enable new nmi_watchdog

From: Ingo Molnar
Date: Fri Jan 29 2010 - 03:12:53 EST



* Don Zickus <dzickus@xxxxxxxxxx> wrote:

> On Thu, Jan 28, 2010 at 03:54:54PM +0100, Peter Zijlstra wrote:
> > On Wed, 2010-01-27 at 15:03 -0500, Don Zickus wrote:
> > > These are the bits that enable the new nmi_watchdog and safely isolate the
> > > old nmi_watchdog. Only one or the other can run, not both at the same
> > > time.
> >
> > perf disables the lapic watchdog when it wants the pmu, so there
> > shouldn't be a problem having both built in.
>
> Yes it does disable but does not prevent nmi_watchdog_tick from running nor
> the /proc interface from being loaded. So perhaps my description isn't very
> good. The idea with the new watchdog was to re-use some of the bits of the
> old one, but having them both compiled in seemed to stomp on each other.
> That is what I was trying to prevent.
>
> I can certainly change the behaviour, just makes the code a little more
> messy I think.

I think that's a good idea - and i think we want to be bold and just have the
new code run seamlessly. (and fix bugs, if any.)

In fact we want to be even bolder: how about enabling the NMI watchdog by
default again?

The problem with the old one was its fragility - but now, if we have a PMU
driver active and perf events enabled, we might as well use your brand new NMI
watchdog code as a testing facility too: if there's _any_ problem with NMIs
then regular 'perf' use would trigger it as well - except that not everyone
runs perf, whereas an always-enabled NMI watchdog would run for everyone.

And it would detect hard hangs too.

What do you think?

Ingo