Re: [PATCH] watchdog: nohz: don't run watchdog on nohz_full cores

From: Mike Galbraith
Date: Mon Mar 30 2015 - 22:04:52 EST


On Mon, 2015-03-30 at 15:12 -0400, Don Zickus wrote:
> On Mon, Mar 30, 2015 at 02:51:05PM -0400, cmetcalf@xxxxxxxxxx wrote:
> > From: Chris Metcalf <cmetcalf@xxxxxxxxxx>
> >
> > Running watchdog can be a helpful debugging feature on regular
> > cores, but it's incompatible with nohz_full, since it forces
> > regular scheduling events. Accordingly, just exit out immediately
> > from any nohz_full core.
> >
> > An alternate approach would be to add a flags field or function to
> > smp_hotplug_thread to control on which cores the percpu threads
> > are created, but it wasn't clear that much mechanism was useful.
>
> Hi Chris,
>
> It seems like the correct solution would be to hook into the idle_loop
> somehow. If the cpu is idle, then it seems unlikely that a lockup could
> occur.
>
> My fear with this approach is a lockup would occur on the nohz cpu and it
> would go undetected because that cpu is disabled. Further no printk is
> thrown out to even indicate a cpu is disabled making it more difficult to
> debug.

Hm, I don't see why this is needed: for debugging/testing you turn it
on, and when you set up for critical operation you turn it off.

A bigger deal is the clocksource watchdog methinks. Measurement
inspired me to make it dead yesterday.

-Mike
