Re: fix BUG: using smp_processor_id() in touch_nmi_watchdog andtouch_softlockup_watchdog

From: Cyrill Gorcunov
Date: Wed Aug 18 2010 - 17:44:57 EST


On Wed, Aug 18, 2010 at 12:33:46PM -0700, Andrew Morton wrote:
> On Fri, 13 Aug 2010 13:21:58 +0300
> Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx> wrote:
>
> > Hello,
> >
> > Got this traces today:
> >
> > ...
> >

...

> > void touch_softlockup_watchdog(void)
> > {
> > - __get_cpu_var(watchdog_touch_ts) = 0;
> > + int this_cpu = get_cpu();
> > + per_cpu(watchdog_touch_ts, this_cpu) = 0;
> > + put_cpu();
> > }
> > EXPORT_SYMBOL(touch_softlockup_watchdog);
> >
> > @@ -142,7 +143,9 @@ void touch_all_softlockup_watchdogs(void)
> > #ifdef CONFIG_HARDLOCKUP_DETECTOR
> > void touch_nmi_watchdog(void)
> > {
> > - __get_cpu_var(watchdog_nmi_touch) = true;
> > + int this_cpu = get_cpu();
> > + per_cpu(watchdog_nmi_touch, this_cpu) = true;
> > + put_cpu();
> > touch_softlockup_watchdog();
> > }
> > EXPORT_SYMBOL(touch_nmi_watchdog);
>
> Why did this start happening? Surely we've called
> touch_softlockup_watchdog() from within preemptible code before now.

indeed, and we've been using __raw interface before (2.6.18)

void touch_softlockup_watchdog(void)
{
__raw_get_cpu_var(touch_timestamp) = jiffies;
}

> Methinks that
>
> : commit 26e09c6eee14f4827b55137ba0eedc4e77cd50ab
> : Author: Don Zickus <dzickus@xxxxxxxxxx>
> : AuthorDate: Mon May 17 18:06:04 2010 -0400
> : Commit: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> : CommitDate: Wed May 19 11:32:14 2010 +0200
> :
> : lockup_detector: Convert per_cpu to __get_cpu_var for readability
>
> was simply broken? That would be strange, given that it's been sitting
> around since May 17.
>
> If we don't want to revert 26e09c6eee14f4827b55137ba0eedc4e77cd50ab
> then I'd suggest that we simply switch to raw_smp_processor_id(): those
> newly-added get_cpu/put_cpu calls don't do anything useful.
>

I think it's fine to use __get_cpu_var in touch_nmi_watchdog (which should not
be called with irq enabled since it ticks anyway then, at least on x86) for hardware
nmi watchdog, can't conclude anything about softlockup (except that we had __raw
interface before) since I'm not familiar with soflockup watchdog at moment.

-- Cyrill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/