Re: sched/core warning triggers on rcu torture test

From: Peter Zijlstra
Date: Tue Jun 26 2018 - 12:33:05 EST


On Tue, Jun 26, 2018 at 06:16:04PM +0200, Anna-Maria Gleixner wrote:
> Hi,
>
> during rcu torture tests (TREE04 and TREE07) I noticed that a
> WARN_ON_ONCE() in sched core triggers on a recent 4.18-rc2 based
> kernel (6f0d349d922b ("Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")) as well as
> on a 4.17.3.
>
> I'm running the tests on a machine with 144 cores:
>
> tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 144 --duration 120 --configs "9*TREE07"
> tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 144 --duration 120 --configs "18*TREE04"
>
>
> The warning was introduced by commit d84b31313ef8 ("sched/isolation:
> Offload residual 1Hz scheduler tick").
>
>
> Output looks similar for all tests I did (this one is the output of
> the 4.18-rc2 based kernel):
>
> WARNING: CPU: 11 PID: 906 at kernel/sched/core.c:3138 sched_tick_remote+0xb6/0xc0

That's nohz_full stuff; is that a normal part of rcutorture? In any
case, is the one housekeeping CPU getting seriously overloaded or
something?