Re: [PATCH] sched: watchdog: Touch kernel watchdog in sched code

From: Thomas Gleixner
Date: Thu Mar 05 2020 - 13:07:29 EST


Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

> On Wed, Mar 04, 2020 at 01:39:41PM -0800, Xi Wang wrote:
>> The main purpose of the kernel watchdog is to test whether the scheduler
>> can still schedule tasks on a cpu. In order to reduce latency from
>> periodically invoking watchdog reset in thread context, we can simply
>> touch the watchdog from pick_next_task in the scheduler. Compared to
>> actually resetting the watchdog from cpu stop / migration threads, we
>> lose coverage of two steps: a migration thread actually gets picked, and
>> we actually context switch to the migration thread. Both steps are
>> heavily protected by kernel locks and unlikely to fail silently. Thus
>> the change would provide the same level of protection with less overhead.
>>
>> The new way vs the old way to touch the watchdogs is configurable
>> from:
>>
>> /proc/sys/kernel/watchdog_touch_in_thread_interval
>>
>> The value means:
>>   0: Always touch watchdog from pick_next_task
>>   1: Always touch watchdog from migration thread
>>   N (N>0): Touch watchdog from migration thread once in every N
>>            invocations, and touch watchdog from pick_next_task for
>>            other invocations.
>>
>
> This is configurable madness. What are we really trying to do here?

Create yet another knob which will be advertised in random web blogs to
solve all problems of the world and some more. Like the one which got
silently turned into a NOOP ~10 years ago :)
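
For reference, the semantics of the proposed
watchdog_touch_in_thread_interval values quoted above could be sketched
roughly as below. This is only an illustration of the 0 / 1 / N rule as
described in the patch text, not code from the patch; the function name
touch_from_thread() and the per-cpu invocation counter are hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not from the patch: decide where the watchdog
 * is touched for a given invocation.
 *
 * interval == 0: never use the migration thread (always pick_next_task)
 * interval == 1: always use the migration thread
 * interval == N: use the migration thread once in every N invocations
 *
 * Returns true for "touch from migration thread", false for
 * "touch from pick_next_task".
 */
static bool touch_from_thread(unsigned int interval, unsigned long invocation)
{
	if (interval == 0)
		return false;

	/* With interval == 1 this is true for every invocation. */
	return (invocation % interval) == 0;
}
```

With interval set to 4, invocations 0, 4, 8, ... would go through the
migration thread and everything in between would be handled from
pick_next_task.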