Re: [PATCH v3] doc/rcutorture: Add description of rcutorture.stall_cpu_block

From: Paul E. McKenney
Date: Wed Mar 22 2023 - 19:57:26 EST


On Tue, Mar 21, 2023 at 10:12:34AM +0800, Zqiang wrote:
> For kernels built with CONFIG_PREEMPTION=n and CONFIG_PREEMPT_COUNT=y,
> run the RCU stall tests.
>
> runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> bootparams="console=ttyS0 rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_block=1" -d
>
> [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> [ 10.841073] rcu_torture_stall start on CPU 3.
> [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> ....
> [ 10.841108] Call Trace:
> [ 10.841110] <TASK>
> [ 10.841112] dump_stack_lvl+0x64/0xb0
> [ 10.841118] dump_stack+0x10/0x20
> [ 10.841121] __schedule_bug+0x8b/0xb0
> [ 10.841126] __schedule+0x2172/0x2940
> [ 10.841157] schedule+0x9b/0x150
> [ 10.841160] schedule_timeout+0x2e8/0x4f0
> [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> [ 10.841195] rcu_torture_stall+0x2e8/0x300
> [ 10.841199] kthread+0x175/0x1a0
> [ 10.841206] ret_from_fork+0x2c/0x50
>
> Because invoking schedule_timeout() forces the CPU through a quiescent
> state, the RCU stall does not appear, and the sleep also triggers
> scheduling-while-atomic complaints. This commit therefore adds a
> description of rcutorture.stall_cpu_block noting that it should not be
> set in CONFIG_PREEMPTION=n kernels.
>
> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>

Very good, thank you!

I did the usual wordsmithing, so please check for any errors that I may
have introduced.

Thanx, Paul

------------------------------------------------------------------------

commit 1e7cded7383986d788cb2020bd4e56b298518de2
Author: Zqiang <qiang1.zhang@xxxxxxxxx>
Date: Tue Mar 21 10:12:34 2023 +0800

doc/rcutorture: Add description of rcutorture.stall_cpu_block

If you build a kernel with CONFIG_PREEMPTION=n and CONFIG_PREEMPT_COUNT=y,
then run the rcutorture tests specifying stalls as follows:

runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4" \
bootparams="console=ttyS0 rcutorture.stall_cpu=30 \
rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_block=1" -d

The tests will produce the following splat:

[ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
[ 10.841073] rcu_torture_stall start on CPU 3.
[ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
....
[ 10.841108] Call Trace:
[ 10.841110] <TASK>
[ 10.841112] dump_stack_lvl+0x64/0xb0
[ 10.841118] dump_stack+0x10/0x20
[ 10.841121] __schedule_bug+0x8b/0xb0
[ 10.841126] __schedule+0x2172/0x2940
[ 10.841157] schedule+0x9b/0x150
[ 10.841160] schedule_timeout+0x2e8/0x4f0
[ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
[ 10.841195] rcu_torture_stall+0x2e8/0x300
[ 10.841199] kthread+0x175/0x1a0
[ 10.841206] ret_from_fork+0x2c/0x50

This is because the rcutorture.stall_cpu_block=1 module parameter causes
rcu_torture_stall() to invoke schedule_timeout_uninterruptible() within
an RCU read-side critical section. This in turn results in a quiescent
state (which prevents the stall) and a sleep in an atomic context (which
produces the above splat).
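
As a rough illustration only, the configuration condition described above
can be checked against a kernel .config before passing
rcutorture.stall_cpu_block=1. The helper name below is hypothetical and is
not part of rcutorture or the kernel tree. The sketch assumes the usual
.config convention in which a disabled option is either absent or written
as "# CONFIG_FOO is not set".

```shell
#!/bin/sh
# Hypothetical helper (not part of the kernel tree or rcutorture).
# Exit status:
#   0  CONFIG_PREEMPTION is not enabled and CONFIG_PREEMPT_COUNT=y, which is
#      the combination where stall_cpu_block=1 is described as suppressing
#      the stall and producing scheduling-while-atomic splats
#   1  the .config does not match that combination
#   2  the .config file cannot be read
stall_cpu_block_splat_expected() {
    config_file=$1
    [ -r "$config_file" ] || return 2
    # An enabled option appears as CONFIG_FOO=y at the start of a line.
    if grep -q '^CONFIG_PREEMPTION=y$' "$config_file"; then
        return 1
    fi
    # The function's status is that of this final grep: 0 if found, else 1.
    grep -q '^CONFIG_PREEMPT_COUNT=y$' "$config_file"
}
```

For example, "stall_cpu_block_splat_expected .config && echo 'expect splats'"
would print the message only for a matching configuration. The check looks
at the build configuration only and says nothing about the running kernel.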

Although this code is operating as designed, the design has proven to
be counterintuitive to many. This commit therefore updates the description
in kernel-parameters.txt accordingly.

Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 51067903af84..b39a4ab56b95 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5058,8 +5058,16 @@

rcutorture.stall_cpu_block= [KNL]
Sleep while stalling if set. This will result
- in warnings from preemptible RCU in addition
- to any other stall-related activity.
+ in warnings from preemptible RCU in addition to
+ any other stall-related activity. Note that
+ in kernels built with CONFIG_PREEMPTION=n and
+ CONFIG_PREEMPT_COUNT=y, this parameter will
+ cause the CPU to pass through a quiescent state.
+ Any such quiescent states will suppress RCU CPU
+ stall warnings, but the time-based sleep will
+ also result in scheduling-while-atomic splats.
+ Which might or might not be what you want.
+

rcutorture.stall_cpu_holdoff= [KNL]
Time to wait (s) after boot before inducing stall.