Re: [PATCH v3 1/1] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM

From: Sagi Grimberg
Date: Sun Aug 28 2022 - 10:31:53 EST



As you may have already read, I'm so new to kernel development that I still
know very little about many subsystems and drivers. I am not currently
able to tell the difference between BVEC and KVEC. I could probably try to
switch from one to the other (after learning from other code), however I won't
be able to explain in the commit message why users would be better off using
BVEC in this case.

struct kvec: pairs of form <kernel address, length>
struct bio_vec: triples of form <page, offset, length>

Either is a way to refer to a chunk of memory; the former obviously has it
already mapped (you don't get kernel addresses otherwise), the latter doesn't
need to.
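
To make the two shapes concrete, here is a minimal userspace sketch, not
kernel code. The model_* names are hypothetical; model_page and
model_map_page() are stand-ins assumed here for struct page and
kmap_local_page(). Only the field layout of the two vector types is meant
to mirror struct kvec and struct bio_vec.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for struct page: in userspace it just owns bytes. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Same shape as struct kvec: <kernel address, length>. */
struct model_kvec {
	void *iov_base;
	size_t iov_len;
};

/* Same shape as struct bio_vec: <page, offset, length>. */
struct model_bio_vec {
	struct model_page *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

/* Assumed stand-in for kmap_local_page(): yields an address for the page. */
static void *model_map_page(struct model_page *page)
{
	return page->data;
}

/* A kvec already carries an address, so reading it is a plain memcpy(). */
static size_t model_read_kvec(const struct model_kvec *kv, void *dst, size_t len)
{
	size_t n = len < kv->iov_len ? len : kv->iov_len;

	memcpy(dst, kv->iov_base, n);
	return n;
}

/* A bio_vec must be turned into an address first, then copied from. */
static size_t model_read_bvec(const struct model_bio_vec *bv, void *dst, size_t len)
{
	size_t n = len < bv->bv_len ? len : bv->bv_len;
	unsigned char *addr;

	assert(bv->bv_offset + bv->bv_len <= MODEL_PAGE_SIZE);
	addr = model_map_page(bv->bv_page);
	memcpy(dst, addr + bv->bv_offset, n);
	return n;
}
```

Both readers return the same bytes for the same chunk; the only difference
is who had to come up with the address, and when.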

iov_iter instances might be backed by different things, including
arrays of kvec (iov_iter_kvec() constructs such) and arrays of
bio_vec (iov_iter_bvec() is the constructor for those).

iov_iter primitives (copy_to_iter/copy_from_iter/copy_page_to_iter/etc.)
work with either variant - they look at the flavour and act accordingly.

ITER_BVEC ones tend to do that kmap_local_page() + copy + kunmap_local().
ITER_KVEC obviously use memcpy() for copying and that's it.
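
The flavour dispatch can be sketched in userspace as below. This is a toy
model under stated assumptions, not what lib/iov_iter.c looks like: every
model_* name is hypothetical, model_kmap_local_page()/model_kunmap_local()
are stand-ins that only count calls, and the real iov_iter tracks its
position differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins for struct page, kvec and bio_vec. */
struct model_page { unsigned char data[4096]; };
struct model_kvec { void *iov_base; size_t iov_len; };
struct model_bio_vec {
	struct model_page *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

enum model_iter_type { MODEL_ITER_KVEC, MODEL_ITER_BVEC };

struct model_iter {
	enum model_iter_type type;
	size_t nr_segs;  /* segments left, including the current one */
	size_t seg_off;  /* bytes already consumed from the current segment */
	union {
		const struct model_kvec *kvec;
		const struct model_bio_vec *bvec;
	};
};

static unsigned int model_map_calls; /* how often a page got mapped */

static void *model_kmap_local_page(struct model_page *page)
{
	model_map_calls++;
	return page->data;
}

static void model_kunmap_local(void *addr)
{
	(void)addr; /* nothing to undo in this model */
}

/* Copy up to 'bytes' out of the iterator, acting on its flavour. */
static size_t model_copy_from_iter(void *dst, size_t bytes, struct model_iter *i)
{
	unsigned char *out = dst;
	size_t done = 0;

	while (done < bytes && i->nr_segs) {
		int is_kvec = i->type == MODEL_ITER_KVEC;
		size_t seg_len = is_kvec ? i->kvec->iov_len : i->bvec->bv_len;
		size_t n = seg_len - i->seg_off;

		if (n > bytes - done)
			n = bytes - done;

		if (is_kvec) {
			/* Address is already there: memcpy() and that's it. */
			memcpy(out + done,
			       (unsigned char *)i->kvec->iov_base + i->seg_off, n);
		} else {
			/* Map on demand, copy, unmap. */
			unsigned char *addr = model_kmap_local_page(i->bvec->bv_page);

			memcpy(out + done, addr + i->bvec->bv_offset + i->seg_off, n);
			model_kunmap_local(addr);
		}

		done += n;
		i->seg_off += n;
		if (i->seg_off == seg_len) {
			/* Current segment exhausted: step to the next one. */
			i->seg_off = 0;
			i->nr_segs--;
			if (is_kvec)
				i->kvec++;
			else
				i->bvec++;
		}
	}
	return done;
}
```

The caller's code is the same for either flavour; in this model the mapping
happens only in the BVEC branch, and only for as long as the copy of that
one segment takes.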

If you need e.g. to send some subranges of some pages you could kmap them,
form kvec array, make msg.msg_iter a KVEC-backed iterator over those and
feed it to sendmsg(). Or you could take a bio_vec array instead, make
msg.msg_iter a BVEC-backed iterator over that and feed to sendmsg().

The difference is, in the latter case kmap_local_page() will be done on demand
*inside* ->sendmsg() instance, when it gets around to copying some data
from the source and calls something like csum_and_copy_from_iter() or
whichever primitive it chooses to use.

Why bother with mapping the damn thing in the caller and having it pinned
all along whatever ->sendmsg() you end up calling? Just give it
page/offset/length instead of address/length and let lib/iov_iter.c
do the right thing...

Thanks for the info Al. IIRC, the original goal of this code was to avoid
the kmap/kunmap overhead on every copy, as the buffers' lifetime matches
the request itself. However, as noted in this thread, the buffers are
never highmem, and hence the mapping doesn't have an associated overhead.

I agree that this can be converted to use bio_vec, I'll give it a try.