Re: [PATCH] sched: watchdog: Touch kernel watchdog in sched code

From: Peter Zijlstra
Date: Fri Mar 06 2020 - 03:40:56 EST


On Thu, Mar 05, 2020 at 02:11:49PM -0800, Paul Turner wrote:
> The goal is to reduce jitter, since we're periodically
> preempting other classes to run the watchdog. Even on a single CPU
> this is measurable as jitter in the us range. But, what increases the
> motivation is this disruption has been recently magnified by CPU
> "gifts" which require evicting the whole core when one of the siblings
> schedules one of these watchdog threads.
>
> The majority outcome being asserted here is that we could actually
> exercise pick_next_task if required -- there are other potential
> things this will catch, but they are much more braindead generally
> speaking (e.g. a bug in pick_next_task itself).

I still utterly hate what the patch does though; there is no way I'll
have watchdog code hook into the scheduler like this. That's just asking
for trouble.

Why isn't it sufficient to sample the existing context switch counters
from the watchdog? And if it isn't, why can't we fix that?
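
A minimal userspace sketch of the sampling idea, for illustration only. It
assumes a monotonically increasing context switch counter such as the "ctxt"
line in /proc/stat; struct cs_watch, cs_watch_sample() and parse_ctxt() are
hypothetical names, not kernel interfaces, and this is not what the patch or
the in-kernel watchdog does. It also ignores the idle case, where a flat
counter is not a stall.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical watchdog state; not taken from the kernel. */
struct cs_watch {
	unsigned long long last_count;	/* counter value at the last change */
	unsigned int stalled_samples;	/* consecutive samples with no progress */
};

/*
 * Feed one sample of the context switch counter.
 * Returns true once the counter has stayed flat for `threshold`
 * consecutive samples, false otherwise.
 */
static bool cs_watch_sample(struct cs_watch *w, unsigned long long count,
			    unsigned int threshold)
{
	if (count != w->last_count) {
		/* Progress was made: remember the new value and reset. */
		w->last_count = count;
		w->stalled_samples = 0;
		return false;
	}
	w->stalled_samples++;
	return w->stalled_samples >= threshold;
}

/*
 * Extract the system-wide "ctxt" value from text in /proc/stat format.
 * Returns 0 on success, -1 if no "ctxt" line is present.
 */
static int parse_ctxt(const char *stat_text, unsigned long long *out)
{
	const char *line = stat_text;

	while (line && *line) {
		if (sscanf(line, "ctxt %llu", out) == 1)
			return 0;
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A caller would read /proc/stat on each watchdog period, pass the text to
parse_ctxt(), and feed the result to cs_watch_sample(); the threshold trades
detection latency against false positives.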