RE: [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages

From: Bernard Metzler
Date: Wed Jul 23 2025 - 12:50:17 EST




> -----Original Message-----
> From: Pedro Falcato <pfalcato@xxxxxxx>
> Sent: Wednesday, 23 July 2025 17:49
> To: Bernard Metzler <BMT@xxxxxxxxxxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxx>; Leon Romanovsky <leon@xxxxxxxxxx>;
> Vlastimil Babka <vbabka@xxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>; David
> Howells <dhowells@xxxxxxxxxx>; Tom Talpey <tom@xxxxxxxxxx>; linux-
> rdma@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> stable@xxxxxxxxxxxxxxx; kernel test robot <oliver.sang@xxxxxxxxx>
> Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in
> siw_tcp_sendpages
>
> On Wed, Jul 23, 2025 at 02:52:12PM +0000, Bernard Metzler wrote:
> >
> >
> > > -----Original Message-----
> > > From: Pedro Falcato <pfalcato@xxxxxxx>
> > > Sent: Wednesday, 23 July 2025 12:41
> > > To: Jason Gunthorpe <jgg@xxxxxxxx>; Bernard Metzler
> <BMT@xxxxxxxxxxxxxx>;
> > > Leon Romanovsky <leon@xxxxxxxxxx>; Vlastimil Babka <vbabka@xxxxxxx>
> > > Cc: Jakub Kicinski <kuba@xxxxxxxxxx>; David Howells
> <dhowells@xxxxxxxxxx>;
> > > Tom Talpey <tom@xxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; linux-
> > > kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; Pedro Falcato
> > > <pfalcato@xxxxxxx>; stable@xxxxxxxxxxxxxxx; kernel test robot
> > > <oliver.sang@xxxxxxxxx>
> > [snip]
> > > ---
> > > drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > index 3a08f57d2211..9576a2b766c4 100644
> > > --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s,
> struct
> > > page **page, int offset,
> > > if (!sendpage_ok(page[i]))
> > > msg.msg_flags &= ~MSG_SPLICE_PAGES;
> > > bvec_set_page(&bvec, page[i], bytes, offset);
> > > - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> > > + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> > >
> > > try_page_again:
> > > lock_sock(sk);
> > > - rv = tcp_sendmsg_locked(sk, &msg, size);
> > > + rv = tcp_sendmsg_locked(sk, &msg, bytes);
> > > release_sock(sk);
> > >
> >
> > Pedro, many thanks for catching this! I completely
> > missed it during my too sloppy review of that patch.
> > It's a serious bug which must be fixed asap.
> > BUT, looking closer, I do not see the offset being taken
> > into account when retrying a current segment. So,
> > resend attempts seem to send old data which are already
> > out. Shouldn't the try_page_again: label be above
> > bvec_set_page()??
>
> This was raised off-list by Vlastimil - I think it's harmless to bump (but
> not use)
> the offset here, because by reusing the iov_iter we progressively consume
> the data
> (it keeps its own size and offset tracking internally). So the only thing
> we
> need to track is the size we pass to tcp_sendmsg_locked[1].
>
Ah okay, I didn't know that. Are we sure? I am currently travelling and have
only limited possibilities to try out things. I just looked up other
use cases and found one in net/tls/tls_main.c#L197. Here the loop looks
very similar, but it works as I was suggesting (taking offset into account
and re-initializing new bvec in case of partial send).

> If desired (and if my logic is correct!) I can send a v2 deleting that bit.
>

So yes if that's all save, please. We shall not have dead code.

Thanks!
Bernard.
>
> [1] Assuming tcp_sendmsg_locked guarantees it will never consume something
> out
> of the iovec_iter without reporting it as bytes copied, which from a code
> reading
> it seems like it won't...
>
>
> --
> Pedro