Re: [PATCH 1/3] [lockup detector] sync touch_*_watchdog back to old semantics

From: Ingo Molnar
Date: Wed Sep 01 2010 - 03:20:33 EST



* Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:

> On 9/1/10, Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
> > On 9/1/10, Ingo Molnar <mingo@xxxxxxx> wrote:
> >>
> >> * Don Zickus <dzickus@xxxxxxxxxx> wrote:
> >>
> >>> void touch_nmi_watchdog(void)
> >>> {
> >>> -	__get_cpu_var(watchdog_nmi_touch) = true;
> >>> +	if (watchdog_enabled) {
> >>> +		unsigned cpu;
> >>> +
> >>> +		for_each_present_cpu(cpu) {
> >>> +			if (per_cpu(watchdog_nmi_touch, cpu) != true)
> >>> +				per_cpu(watchdog_nmi_touch, cpu) = true;
> >>> +		}
> >>
> >> Hm, this is going to be a scalability nightmare with lots of CPUs. Not
> >> only do we have a nr_cpus loop, but we touch per-cpu areas of _other_
> >> CPUs - a big scalability no-no.
> >>
> >> Why do we need to do this? We never needed to touch other CPU's NMI
> >> lockup accounting data areas - why has this changed? The changelog does
> >> not explain this.
> >>
> >> Thanks,
> >>
> >> Ingo
> >>
> > I believe this came from old nmi watchdog code where it might be
> > useful when nmi watchdog activated via io-apic. I'm trying to figure
> > out if we really need it still.
>
> Well, we can't drop it or make it per-cpu specific. For example, we need
> it in case of a panic with the watchdog enabled and a panic timeout set,
> or a boot delay set, etc. It seems the same applies to printk_delay. Hmm...

Ok - can you cite the old watchdog code? Did it really do an nr_cpus
loop?

Thanks,

Ingo
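
For readers following the thread, here is a minimal userspace sketch (not
kernel code) of the two touch strategies being compared above. Everything
prefixed with sketch_ is hypothetical and exists only for illustration: a
plain array stands in for the per-CPU variable watchdog_nmi_touch, and an
index stands in for the calling CPU. In the kernel each slot would live in a
different CPU's per-CPU area, so the loop variant would write to memory owned
by other CPUs, which is the cost being questioned.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical CPU count; a stand-in for the set of present CPUs. */
#define SKETCH_NR_CPUS 8

/* One flag per simulated CPU, standing in for per-CPU watchdog_nmi_touch. */
static bool sketch_nmi_touch[SKETCH_NR_CPUS];

/* Clear every simulated per-CPU flag. */
static void sketch_reset(void)
{
	unsigned int cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sketch_nmi_touch[cpu] = false;
}

/*
 * Local variant (the removed '-' line): flag only the caller's own slot.
 * Constant cost, and no write to any other CPU's data.
 */
static void sketch_touch_local(unsigned int this_cpu)
{
	assert(this_cpu < SKETCH_NR_CPUS);
	sketch_nmi_touch[this_cpu] = true;
}

/*
 * Loop variant (the added '+' lines): flag every CPU's slot.
 * Cost grows with the CPU count. The read-before-write check mirrors the
 * patch and skips the store when a slot is already set.
 * Returns the number of slots actually written.
 */
static unsigned int sketch_touch_all(void)
{
	unsigned int cpu, writes = 0;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (!sketch_nmi_touch[cpu]) {
			sketch_nmi_touch[cpu] = true;
			writes++;
		}
	}
	return writes;
}

/* Count how many simulated CPUs currently have the flag set. */
static unsigned int sketch_count_touched(void)
{
	unsigned int cpu, n = 0;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (sketch_nmi_touch[cpu])
			n++;
	return n;
}
```

Calling sketch_touch_local(3) after sketch_reset() sets exactly one flag,
while sketch_touch_all() visits all SKETCH_NR_CPUS slots on every call even
when nothing needs to be written.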