Re: INFO: rcu detected stall in corrupted

From: Xin Long
Date: Thu May 24 2018 - 04:10:17 EST


On Thu, May 24, 2018 at 7:13 AM, Marcelo Ricardo Leitner
<marcelo.leitner@xxxxxxxxx> wrote:
> On Mon, May 21, 2018 at 11:13:46AM -0700, Eric Dumazet wrote:
>>
>>
>> On 05/21/2018 11:09 AM, David Miller wrote:
>> > From: syzbot <syzbot+f116bc1994efe725d51b@xxxxxxxxxxxxxxxxxxxxxxxxx>
>> > Date: Mon, 21 May 2018 11:05:02 -0700
>> >
>> >> find_match+0x244/0x13a0 net/ipv6/route.c:691
>> >> find_rr_leaf net/ipv6/route.c:729 [inline]
>> >> rt6_select net/ipv6/route.c:779 [inline]
>> >
>> > Hmmm, endless loop in find_rr_leaf or similar?
>> >
>>
>>
>> I do not think so, this really looks SCTP-specific;
>> we now have dozens of traces all sharing:
>>
>> sctp_transport_route+0xad/0x450 net/sctp/transport.c:293
>> sctp_packet_config+0xb89/0xfd0 net/sctp/output.c:123
>> sctp_outq_flush+0x79c/0x4370 net/sctp/outqueue.c:894
>> sctp_outq_uncork+0x6a/0x80 net/sctp/outqueue.c:776
>> sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1820 [inline]
>> sctp_side_effects net/sctp/sm_sideeffect.c:1220 [inline]
>> sctp_do_sm+0x596/0x7160 net/sctp/sm_sideeffect.c:1191
>> sctp_generate_heartbeat_event+0x218/0x450 net/sctp/sm_sideeffect.c:406
>> call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
>>
>>
>> Some kind of infinite loop.
>>
>> When the hrtimer fires, the trace can point to whatever code sits below it, and that code does not necessarily have a bug.
>
> Agreed. Xin Long identified the root cause. syzkaller is setting too
> aggressive parameters to SCTP RTO, leading to issues with the
> heartbeat timer.

Right. I will prepare a fix soon, using the rto_min value you suggested, HZ/5.
Thanks.
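
For illustration only, here is a minimal userspace sketch of the kind of lower bound on rto_min discussed above. It is not the actual patch. The names set_rto_min(), ms_to_jiffies() and RTO_MIN_FLOOR_JIFFIES are hypothetical, HZ is assumed to be 1000 for the example, and rejecting a too-small value (rather than clamping it) is just one possible choice.

```c
#include <errno.h>

/* Assumption: HZ (timer ticks per second) is a kernel build setting;
 * 1000 is used here purely for illustration. */
#define HZ 1000

/* Hypothetical floor: HZ/5 jiffies, which is 200 ms whatever HZ is. */
#define RTO_MIN_FLOOR_JIFFIES (HZ / 5)

/* Convert milliseconds to jiffies, rounding up (simplified helper). */
static unsigned long ms_to_jiffies(unsigned long ms)
{
	return (ms * HZ + 999) / 1000;
}

/*
 * Hypothetical setter: accept a user-supplied rto_min (in ms) only if
 * it is at least the floor. On success the value is stored in jiffies
 * and 0 is returned; otherwise -EINVAL is returned and the stored
 * value is left untouched.
 */
static int set_rto_min(unsigned long *rto_min_jiffies,
		       unsigned long requested_ms)
{
	unsigned long j = ms_to_jiffies(requested_ms);

	if (j < RTO_MIN_FLOOR_JIFFIES)
		return -EINVAL;

	*rto_min_jiffies = j;
	return 0;
}
```

With these assumptions, a request of 1 ms is rejected with -EINVAL, while a request of 200 ms is accepted and stored as 200 jiffies, so a timer derived from the RTO cannot be re-armed with a near-zero interval.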