RE: [PATCH net-next v2 01/17] net: Copy slab data for sendmsg(MSG_SPLICE_PAGES)
From: Willem de Bruijn
Date: Sun Jun 18 2023 - 12:43:26 EST
David Howells wrote:
> If sendmsg() is passed MSG_SPLICE_PAGES and is given a buffer that contains
> some data that's resident in the slab, copy it rather than returning EIO.
> This can be made use of by a number of drivers in the kernel, including:
> iwarp, ceph/rds, dlm, nvme, ocfs2, drbd. It could also be used by iscsi,
> rxrpc, sunrpc, cifs and probably others.
>
> skb_splice_from_iter() is given its own fragment allocator as
> page_frag_alloc_align() can't be used because it does no locking to prevent
> parallel callers from racing. alloc_skb_frag() uses a separate folio for
> each cpu and locks to the cpu whilst allocating, reenabling cpu migration
> around folio allocation.
>
> This could allocate a whole page instead for each fragment to be copied, as
> alloc_skb_with_frags() would do instead, but that would waste a lot of
> space (most of the fragments look like they're going to be small).
>
> This allows an entire message that consists of, say, a protocol header or
> two, a number of pages of data and a protocol footer to be sent using a
> single call to sock_sendmsg().
>
> The callers could be made to copy the data into fragments before calling
> sendmsg(), but that then penalises them if MSG_SPLICE_PAGES gets ignored.
>
> Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
> cc: Alexander Duyck <alexander.duyck@xxxxxxxxx>
> cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> cc: David Ahern <dsahern@xxxxxxxxxx>
> cc: Jakub Kicinski <kuba@xxxxxxxxxx>
> cc: Paolo Abeni <pabeni@xxxxxxxxxx>
> cc: Jens Axboe <axboe@xxxxxxxxx>
> cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> cc: Menglong Dong <imagedong@xxxxxxxxxxx>
> cc: netdev@xxxxxxxxxxxxxxx
> ---
>
> Notes:
> ver #2)
> - Fix parameter to put_cpu_ptr() to have an '&'.
>
> include/linux/skbuff.h | 5 ++
> net/core/skbuff.c | 171 ++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 173 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 91ed66952580..0ba776cd9be8 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -5037,6 +5037,11 @@ static inline void skb_mark_for_recycle(struct sk_buff *skb)
> #endif
> }
>
> +void *alloc_skb_frag(size_t fragsz, gfp_t gfp);
> +void *copy_skb_frag(const void *s, size_t len, gfp_t gfp);
> +ssize_t skb_splice_from_iter(struct sk_buff *skb, struct iov_iter *iter,
> + ssize_t maxsize, gfp_t gfp);
> +
> ssize_t skb_splice_from_iter(struct sk_buff *skb, struct iov_iter *iter,
> ssize_t maxsize, gfp_t gfp);
>
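Duplicate declaration: skb_splice_from_iter() is already declared in the
context lines directly below the ones being added.
(No need to respin just for this, imho.)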
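
For anyone following the allocator description in the quoted commit message,
here is a rough userspace sketch of the general idea: small fragments are
carved out of one larger buffer instead of taking a whole page per copy. This
is not the code from the patch. The names sketch_alloc_frag(),
sketch_copy_frag(), FRAG_PAGE_SIZE and FRAG_ALIGN are made up for
illustration, a thread-local cache stands in for the per-cpu folio, and
malloc() stands in for folio allocation. It does not model pinning to a cpu,
reenabling migration around allocation, or page refcounting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FRAG_PAGE_SIZE 4096u	/* hypothetical backing-buffer size */
#define FRAG_ALIGN     8u	/* hypothetical fragment alignment */

struct frag_cache {
	unsigned char *buf;	/* current backing buffer, NULL until first use */
	size_t offset;		/* next free byte within buf */
};

/* One cache per thread: a userspace stand-in for "one folio per cpu". */
static _Thread_local struct frag_cache frag_cache;

/* Carve fragsz bytes out of the current thread's backing buffer. */
static void *sketch_alloc_frag(size_t fragsz)
{
	struct frag_cache *c = &frag_cache;
	size_t need = (fragsz + FRAG_ALIGN - 1) & ~(size_t)(FRAG_ALIGN - 1);
	void *p;

	if (fragsz == 0 || need > FRAG_PAGE_SIZE)
		return NULL;

	if (!c->buf || c->offset + need > FRAG_PAGE_SIZE) {
		/*
		 * Start a new backing buffer. The old one is simply
		 * abandoned in this sketch; real code would need
		 * reference counting so it can be freed once every
		 * fragment carved from it has been released.
		 */
		unsigned char *nbuf = malloc(FRAG_PAGE_SIZE);

		if (!nbuf)
			return NULL;
		c->buf = nbuf;
		c->offset = 0;
	}

	p = c->buf + c->offset;
	c->offset += need;
	assert(c->offset <= FRAG_PAGE_SIZE);
	return p;
}

/* Copy len bytes from s into a freshly carved fragment. */
static void *sketch_copy_frag(const void *s, size_t len)
{
	void *p = sketch_alloc_frag(len);

	if (p)
		memcpy(p, s, len);
	return p;
}
```

Two small copies made back to back on the same thread land next to each other
in one buffer, which is the space saving the commit message is after compared
with a page per fragment.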