Re: [RFC -v3 2/2] watchdog: update watchdog_thresh properly

From: Michal Hocko
Date: Tue Jul 23 2013 - 10:51:14 EST


On Tue 23-07-13 10:44:08, Don Zickus wrote:
> On Tue, Jul 23, 2013 at 04:07:29PM +0200, Michal Hocko wrote:
> > On Tue 23-07-13 09:53:34, Don Zickus wrote:
> > > On Mon, Jul 22, 2013 at 04:32:46PM +0200, Michal Hocko wrote:
> > > > The NMI one is disabled and then reinitialized from scratch. This
> > > > has an unpleasant side effect: the allocation of the new event might
> > > > theoretically fail, so the hard lockup detector would be disabled for
> > > > such CPUs. On the other hand, such a memory allocation failure is very
> > > > unlikely because the original event is deallocated right before.
> > > > It would be much nicer if we just changed the perf event period, but there
> > > > doesn't seem to be any API to do that right now.
> > > > It is also unfortunate that perf_event_alloc uses GFP_KERNEL allocation
> > > > unconditionally, so we cannot use on_each_cpu() and do the same thing
> > > > from the per-cpu context. The update from the current CPU should be
> > > > safe because perf_event_disable removes the event atomically before
> > > > it clears the per-cpu watchdog_ev, so it cannot change anything under
> > > > the running handler's feet.
> > >
> > > I guess I don't have a problem with this. I was hoping to have more
> > > shared code with the regular stop/start routines but with the pmu bit
> > > locking (to share pmus with oprofile), you really need to unregister
> > > everything to stop the lockup detector. This makes it a little too heavy
> > > for a restart routine like this.
> >
> > I am not sure I understand the above. Regular stop/start is about all
> > the machinery, I have tried to reduce the restarting to bare minimum.
> > Do you find the current version heavier than the full disable_all &&
> > enable_all?
>
> No, I find your restart mechanism lighter than full disable_all. I would
> love to have the lockup detector just disable itself on stop and re-enable
> on start. But because of oprofile, the lockup detector has to free up its
> event on stop and recreate it on start, which kinda sucks.
>
> Anyway it was just an aside.

Ohh, I see.

> > > The only odd thing is I can't figure out which version you were using to
> > > apply this patch. I can't find old_thresh (though I understand the idea
> > > of it).
> >
> > current Linus tree (linux-next - 20130723 - has it as well AFAICS)
>
> Ok. Thanks. Ah, I see. I forgot Frederic modified pieces there. The
> threading keeps changing. I see why you took your approach.
>
> Should be fine.
>
> Acked-by: Don Zickus <dzickus@xxxxxxxxxx>

Thanks! I will repost this without RFC if nobody else objects.

--
Michal Hocko
SUSE Labs