Re: [PATCH net-next v1] tcp: Correct signedness in skb remaining space calculation

From: Eric Dumazet
Date: Wed Jul 02 2025 - 11:41:12 EST


On Wed, Jul 2, 2025 at 8:28 AM Jiayuan Chen <jiayuan.chen@xxxxxxxxx> wrote:
>
> July 2, 2025 at 22:02, "Eric Dumazet" <edumazet@xxxxxxxxxx> wrote:
>
>
>
> >
> > On Wed, Jul 2, 2025 at 6:59 AM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> >
> > >
> > > On Wed, Jul 2, 2025 at 6:42 AM Jiayuan Chen <jiayuan.chen@xxxxxxxxx> wrote:
> > >
> > > July 2, 2025 at 19:00, "Jiayuan Chen" <jiayuan.chen@xxxxxxxxx> wrote:
> > >
> > > >
> > >
> > > > The calculation for the remaining space, 'copy = size_goal - skb->len',
> > >
> > > >
> > >
> > > > was prone to an integer promotion bug that prevented copy from ever being
> > >
> > > >
> > >
> > > > negative.
> > >
> > > >
> > >
> > > > The variable types involved are:
> > >
> > > >
> > >
> > > > copy: ssize_t (long)
> > >
> > > >
> > >
> > > > size_goal: int
> > >
> > > >
> > >
> > > > skb->len: unsigned int
> > >
> > > >
> > >
> > > > Due to C's type promotion rules, the signed size_goal is converted to an
> > >
> > > >
> > >
> > > > unsigned int to match skb->len before the subtraction. The result is an
> > >
> > > >
> > >
> > > > unsigned int.
> > >
> > > >
> > >
> > > > When this unsigned int result is then assigned to the s64 copy variable,
> > >
> > > >
> > >
> > > > it is zero-extended, preserving its non-negative value. Consequently,
> > >
> > > >
> > >
> > > > copy is always >= 0.
> > >
> > > >
> > >
> > > To better explain this problem, consider the following example:
> > >
> > > '''
> > > #include <sys/types.h>
> > > #include <stdio.h>
> > >
> > > int size_goal = 536;
> > > unsigned int skblen = 1131;
> > >
> > > int main(void) {
> > >         ssize_t copy = 0;
> > >
> > >         /* int - unsigned int is evaluated as unsigned int, so it wraps */
> > >         copy = size_goal - skblen;
> > >         printf("wrong: %zd\n", copy);
> > >
> > >         /* casting one operand to ssize_t makes the subtraction signed */
> > >         copy = size_goal - (ssize_t)skblen;
> > >         printf("correct: %zd\n", copy);
> > >
> > >         return 0;
> > > }
> > > '''
> > >
> > > Output:
> > >
> > > '''
> > > wrong: 4294966701
> > > correct: -595
> > > '''
> > >
> > > Can you explain how one skb could have more bytes (skb->len) than size_goal?
> > > If we are under this condition, we already have a prior bug?
> > >
> > > Please describe how you caught this issue.
> > >
> >
> > Also, not sure why the copy variable had to be changed from "int" to "ssize_t".
> >
> > A nicer patch (without a cast) would be to make it an "int" again.
> >
>
> I encountered this issue because I had tcp_repair enabled, which uses
> tcp_init_tso_segs to reset the MSS.
> However, it seems that tcp_bound_to_half_wnd also dynamically adjusts
> the value to be smaller than the current size_goal.
>

Okay, and what was the end result ?

An skb has a limited number of bytes that can be put into it
(MAX_SKB_FRAGS * 32K), and I can't see what the effects are of having
a "not optimally sized skb" in the socket write queue.

BTW if you have a tcp_repair test, I would love to have it in
tools/testing/selftests/net :)

Thanks.

> Looking at the commit history, it's indeed unnecessary to define the
> copy variable as type ssize_t.