RE: [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages

From: Bernard Metzler
Date: Mon Jul 28 2025 - 07:36:47 EST




> -----Original Message-----
> From: Pedro Falcato <pfalcato@xxxxxxx>
> Sent: Friday, 25 July 2025 18:10
> To: Bernard Metzler <BMT@xxxxxxxxxxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxx>; Leon Romanovsky <leon@xxxxxxxxxx>;
> Vlastimil Babka <vbabka@xxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>; David
> Howells <dhowells@xxxxxxxxxx>; Tom Talpey <tom@xxxxxxxxxx>; linux-
> rdma@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> stable@xxxxxxxxxxxxxxx; kernel test robot <oliver.sang@xxxxxxxxx>
> Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in
> siw_tcp_sendpages
>
> On Wed, Jul 23, 2025 at 04:49:30PM +0000, Bernard Metzler wrote:
> >
> >
> > > -----Original Message-----
> > > From: Pedro Falcato <pfalcato@xxxxxxx>
> > > Sent: Wednesday, 23 July 2025 17:49
> > > To: Bernard Metzler <BMT@xxxxxxxxxxxxxx>
> > > Cc: Jason Gunthorpe <jgg@xxxxxxxx>; Leon Romanovsky <leon@xxxxxxxxxx>;
> > > Vlastimil Babka <vbabka@xxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>;
> David
> > > Howells <dhowells@xxxxxxxxxx>; Tom Talpey <tom@xxxxxxxxxx>; linux-
> > > rdma@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> > > stable@xxxxxxxxxxxxxxx; kernel test robot <oliver.sang@xxxxxxxxx>
> > > Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in
> > > siw_tcp_sendpages
> > >
> > > On Wed, Jul 23, 2025 at 02:52:12PM +0000, Bernard Metzler wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Pedro Falcato <pfalcato@xxxxxxx>
> > > > > Sent: Wednesday, 23 July 2025 12:41
> > > > > To: Jason Gunthorpe <jgg@xxxxxxxx>; Bernard Metzler
> > > <BMT@xxxxxxxxxxxxxx>;
> > > > > Leon Romanovsky <leon@xxxxxxxxxx>; Vlastimil Babka <vbabka@xxxxxxx>
> > > > > Cc: Jakub Kicinski <kuba@xxxxxxxxxx>; David Howells
> > > <dhowells@xxxxxxxxxx>;
> > > > > Tom Talpey <tom@xxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; linux-
> > > > > kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; Pedro Falcato
> > > > > <pfalcato@xxxxxxx>; stable@xxxxxxxxxxxxxxx; kernel test robot
> > > > > <oliver.sang@xxxxxxxxx>
> > > > [snip]
> > > > > ---
> > > > > drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
> > > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > > b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > > index 3a08f57d2211..9576a2b766c4 100644
> > > > > --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > > +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > > @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s,
> > > struct
> > > > > page **page, int offset,
> > > > > if (!sendpage_ok(page[i]))
> > > > > msg.msg_flags &= ~MSG_SPLICE_PAGES;
> > > > > bvec_set_page(&bvec, page[i], bytes, offset);
> > > > > - iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> > > > > + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> > > > >
> > > > > try_page_again:
> > > > > lock_sock(sk);
> > > > > - rv = tcp_sendmsg_locked(sk, &msg, size);
> > > > > + rv = tcp_sendmsg_locked(sk, &msg, bytes);
> > > > > release_sock(sk);
> > > > >
> > > >
> > > > Pedro, many thanks for catching this! I completely
> > > > missed it during my too sloppy review of that patch.
> > > > It's a serious bug which must be fixed asap.
> > > > BUT, looking closer, I do not see the offset being taken
> > > > into account when retrying a current segment. So,
> > > > resend attempts seem to send old data which are already
> > > > out. Shouldn't the try_page_again: label be above
> > > > bvec_set_page()??
> > >
> > > This was raised off-list by Vlastimil - I think it's harmless to bump
> (but
> > > not use)
> > > the offset here, because by reusing the iov_iter we progressively
> consume
> > > the data
> > > (it keeps its own size and offset tracking internally). So the only
> thing
> > > we
> > > need to track is the size we pass to tcp_sendmsg_locked[1].
> > >
>
> Hi,
>
> Sorry for the delay.
>
> > Ah okay, I didn't know that. Are we sure? I am currently travelling and
> have
> > only limited possibilities to try out things. I just looked up other
>
> I'm not 100% sure, and if some more authoritative voice (David, or Jakub for
> the net side) could confirm my analysis, it would be great.
>
> > use cases and found one in net/tls/tls_main.c#L197. Here the loop looks
> > very similar, but it works as I was suggesting (taking offset into account
> > and re-initializing new bvec in case of partial send).
> >
> > > If desired (and if my logic is correct!) I can send a v2 deleting that
> bit.
> > >
> >
> > So yes if that's all save, please. We shall not have dead code.
>
> Understood. I'll send a v2 resetting the bvec and iov_iter if we get no
> further
> feedback in the meanwhile.

Let's see if we get input on it. Even if that is all save, I think
we shall put in a short comment explaining why we do not re-initiate
the bvec.

Please include my new maintainer email address in further iterations,
and stop sending to the old. My IBM address goes away end of month.
Thank you!

Bernard.
>
> >
> > Thanks!
> > Bernard.
> > >
> > > [1] Assuming tcp_sendmsg_locked guarantees it will never consume
> something
> > > out
> > > of the iovec_iter without reporting it as bytes copied, which from a
> code
> > > reading
> > > it seems like it won't...
> > >
> > >
> > > --
> > > Pedro
>
> --
> Pedro