Re: INFO: rcu detected stall in sys_sendfile64 (2)

From: Dmitry Vyukov
Date: Tue Mar 12 2019 - 13:17:14 EST


On Tue, Mar 12, 2019 at 3:30 PM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> (Moving most recipients to bcc: in order to avoid flooding.)
>
> On 2019/03/12 13:08, Al Viro wrote:
> > Umm... Might be a good idea to add some plausibility filters - it is,
> > in theory, possible that adding a line in a comment changes behaviour
> > (without compiler bugs, even - playing with __LINE__ is all it would
> > take), but the odds that it's _not_ a false positive are very low.
>
> Well, 108 out of 168 tests done during this bisection failed to test.
> With such high failure ratio, it is possible that by chance no crash
> happened during few tests for specific commit; causing a wrong bisection
> result. I expect that when trying to conclude "git bisect good" for
> specific commit, the tests should be repeated until no crash happened
> during 8 successful tests.

Added to https://github.com/google/syzkaller/issues/1051:

Tetsuo points out that if lots of tests (say, 7 out of 8) fail with
infra problems, then we should retry, skip the commit, or something
along those lines. Otherwise the infra failures zero out the effect of
having multiple independent tests.
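
To make the idea concrete, here is a rough sketch of such a policy.
It is not what syzkaller does today; the outcome labels, the function
name and the max_runs cap are all made up for illustration. Only the
"8 successful tests" threshold comes from Tetsuo's suggestion.

```python
from typing import Callable

# Hypothetical outcome labels for a single test run (not syzkaller's types).
OK = "ok"                      # test ran to completion, no crash
CRASHED = "crashed"            # kernel crashed, whatever the crash title
INFRA_FAILED = "infra_failed"  # e.g. connection timed out, test never ran


def verdict(run_test: Callable[[], str],
            needed_ok: int = 8,
            max_runs: int = 40) -> str:
    """Return "bad", "good" or "skip" for one commit under bisection."""
    ok = 0
    for _ in range(max_runs):
        outcome = run_test()
        if outcome == CRASHED:
            return "bad"       # any crash marks the commit as bad
        if outcome == OK:
            ok += 1
            if ok >= needed_ok:
                return "good"  # enough runs that actually tested something
        # INFRA_FAILED carries no information about the commit: retry.
    return "skip"              # too few real results to decide either way
```

With this shape, 7 infra failures plus 1 clean run ends up as "skip"
(the "git bisect skip" case) rather than "good".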

Thanks.

> Also, this bisection is finding multiple different crash patterns, which
> suggests that the crashed tests are not giving correct feedback to syzbot.

Treating different crashes as just "crash" is intended. Kernel bugs
can manifest in very different ways.
If you want some fun, search for "bpf: sockhash, disallow bpf_tcp_close
and update in parallel" in https://syzkaller.appspot.com/?fixed=upstream
It led to 50+ different failure modes.

> $ grep -F 'run #' bisect.txt\?x\=13220283200000 | wc -l
> 168
> $ grep -F 'Connection timed out' bisect.txt\?x\=13220283200000 | wc -l
> 108
> $ grep -F 'crashed' bisect.txt\?x\=13220283200000
> run #0: crashed: WARNING: ODEBUG bug in netdev_freemem
> run #0: crashed: WARNING: ODEBUG bug in netdev_freemem
> run #1: crashed: INFO: rcu detected stall in corrupted
> run #0: crashed: INFO: rcu detected stall in sys_sendfile64
> run #0: crashed: INFO: rcu detected stall in corrupted
> run #4: crashed: INFO: rcu detected stall in sys_sendfile64
> run #0: crashed: INFO: rcu detected stall in corrupted
> run #1: crashed: INFO: rcu detected stall in corrupted
> run #0: crashed: INFO: rcu detected stall in ext4_file_write_iter
> run #0: crashed: INFO: rcu detected stall in corrupted
> run #0: crashed: INFO: rcu detected stall in sendfile64
> run #0: crashed: INFO: rcu detected stall in corrupted
> run #1: crashed: INFO: rcu detected stall in sendfile64
> run #0: crashed: INFO: rcu detected stall in ext4_file_write_iter
> run #1: crashed: INFO: rcu detected stall in corrupted
> run #0: crashed: INFO: rcu detected stall in corrupted
> run #0: crashed: INFO: rcu detected stall in corrupted
> run #0: crashed: INFO: rcu detected stall in corrupted
> run #1: crashed: INFO: rcu detected stall in corrupted
> run #0: crashed: INFO: rcu detected stall in corrupted
> run #3: crashed: INFO: rcu detected stall in corrupted
> run #0: crashed: INFO: rcu detected stall in do_iter_write
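
For reference, the same counts can be pulled out in one pass with
something like the sketch below. It matches the same substrings as
the grep commands above and additionally tallies crash titles. The
regex assumes "crashed" lines look exactly like the ones quoted
above; the function name is made up.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Assumed shape of a crashed-run line, taken from the quoted log excerpt.
CRASH_RE = re.compile(r"run #\d+: crashed: (?P<title>.+)$")


def summarize(lines: Iterable[str]) -> Tuple[int, int, Counter]:
    """Count runs, timed-out lines, and crash titles in a bisection log."""
    runs = 0
    timeouts = 0
    titles: Counter = Counter()
    for line in lines:
        line = line.rstrip()
        if "run #" in line:                 # same as: grep -F 'run #'
            runs += 1
        if "Connection timed out" in line:  # same as: grep -F 'Connection timed out'
            timeouts += 1
        m = CRASH_RE.search(line)
        if m:
            titles[m.group("title")] += 1
    return runs, timeouts, titles


# A few lines copied from the quoted log excerpt.
sample = [
    "run #0: crashed: WARNING: ODEBUG bug in netdev_freemem",
    "run #1: crashed: INFO: rcu detected stall in corrupted",
    "run #0: crashed: INFO: rcu detected stall in sys_sendfile64",
    "run #0: crashed: INFO: rcu detected stall in corrupted",
]
runs, timeouts, titles = summarize(sample)
# For bisection only "did anything crash" matters; titles are informational.
any_crash = sum(titles.values()) > 0
```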