Re: upstream boot error: BUG: soft lockup in __do_softirq

From: Randy Dunlap
Date: Fri Jul 31 2020 - 12:08:22 EST


On 7/30/20 11:50 PM, Dmitry Vyukov wrote:
> On Fri, Jul 31, 2020 at 8:44 AM syzbot
> <syzbot+8472ea265fe32cc3bf78@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit: 92ed3019 Linux 5.8-rc7
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=10e84cdf100000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=b45e47f6d958ae82
>> dashboard link: https://syzkaller.appspot.com/bug?extid=8472ea265fe32cc3bf78
>> compiler: gcc (GCC) 10.1.0-syz 20200507
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+8472ea265fe32cc3bf78@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> This is a qemu-kvm instance somehow killing the host kernel; the host
> kernel that runs the qemu instances is itself full of RCU stalls. I
> think this is not a bug in the tested kernel.
> We raise the RCU stall timeout from the default 21 seconds to 120
> seconds, but only after boot, using sysctls. I did not find any way to
> change the RCU timeout via cmdline/config (that would be useful).

(adding Paul)


Documentation/RCU/stallwarn.rst says there is a Kconfig:

    CONFIG_RCU_CPU_STALL_TIMEOUT

        This kernel configuration parameter defines the period of time
        that RCU will wait from the beginning of a grace period until it
        issues an RCU CPU stall warning. This time period is normally
        21 seconds.

and Documentation/admin-guide/kernel-parameters.txt lists two RCU stall
timeout boot parameters, one for CPU stalls and one for task stalls:

    rcupdate.rcu_cpu_stall_timeout= [KNL]
            Set timeout for RCU CPU stall warning messages.

    rcupdate.rcu_task_stall_timeout= [KNL]
            Set timeout in jiffies for RCU task stall warning
            messages. Disable with a value less than or equal
            to zero.
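
So, as an untested sketch, booting the guest with something like
rcupdate.rcu_cpu_stall_timeout=120 appended to the kernel command line
(or building with CONFIG_RCU_CPU_STALL_TIMEOUT=120) should apply the
longer timeout from early boot. I am assuming the value is in seconds,
like the Kconfig default. To check what a booted kernel actually got,
a small Python helper along these lines could compare the command line
with the runtime value. The helper names are made up for illustration,
and the sysfs path is my assumption about where the module parameter
is exposed:

```python
# Hypothetical helper: compare the RCU CPU stall timeout requested on the
# kernel command line with the value the running kernel reports.
PARAM = "rcupdate.rcu_cpu_stall_timeout"
# Assumed location of the module parameter in sysfs.
SYSFS_PATH = "/sys/module/rcupdate/parameters/rcu_cpu_stall_timeout"


def cmdline_value(cmdline, name=PARAM):
    """Return the last value given for `name` on a command line, or None.

    Splits on whitespace only, so quoted values containing spaces are
    not handled.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == name:
            value = val  # a later occurrence replaces an earlier one
    return value


def read_text(path):
    """Return the stripped contents of `path`, or None if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def main():
    cmdline = read_text("/proc/cmdline") or ""
    print("requested on cmdline:", cmdline_value(cmdline))
    print("runtime value:", read_text(SYSFS_PATH))


if __name__ == "__main__":
    main()
```

If the first line prints None, the parameter was not passed at boot and
the runtime value is whatever the Kconfig default or a later write set.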


--
~Randy