Re: [PATCH v2 4.19] tcp: fix TCP socks unreleased in BBR mode

From: Jason Xing
Date: Thu Jun 04 2020 - 09:48:18 EST


On Thu, Jun 4, 2020 at 9:10 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>
> On Thu, Jun 4, 2020 at 2:01 AM <kerneljasonxing@xxxxxxxxx> wrote:
> >
> > From: Jason Xing <kerneljasonxing@xxxxxxxxx>
> >
> > In BBR mode, many TCP socks are never released because sock_hold()
> > is called a second time in tcp_internal_pacing() when an RTO
> > happens. As a result, slab memory grows rapidly and repeatedly
> > triggers OOM until the system crashes.
> >
> > Besides BBR, any other mode that uses the pacing function could
> > trigger the problem described above.
> >
> > Reproduce procedure:
> > 0) cat /proc/slabinfo | grep TCP
> > 1) switch net.ipv4.tcp_congestion_control to bbr
> > 2) use a tool such as wrk to send packets
> > 3) use tc to add delay and loss to simulate the RTO case.
> > 4) cat /proc/slabinfo | grep TCP
> > 5) kill the wrk command and observe the number of objects and slabs in
> > TCP.
> > 6) finally, notice that the number does not decrease.
> >
> > v2: extend the timer which could cover all those related potential risks
> > (suggested by Eric Dumazet and Neal Cardwell)
> >
> > Signed-off-by: Jason Xing <kerneljasonxing@xxxxxxxxx>
> > Signed-off-by: liweishi <liweishi@xxxxxxxxxxxx>
> > Signed-off-by: Shujin Li <lishujin@xxxxxxxxxxxx>
>
> That is not how things work really.
>
> I will submit this properly so that stable teams do not have to guess
> how to backport this to various kernels.
>
> Changelog is misleading, this has nothing to do with BBR, we need to be precise.
>

Thanks for your help. I can finally apply this patch to my kernel.

Looking forward to your patchset :)

Jason

> Thank you.