Re: [PATCH v1 0/6] no-copy bvec

From: Douglas Gilbert
Date: Thu Dec 24 2020 - 11:48:41 EST


On 2020-12-24 1:41 a.m., Christoph Hellwig wrote:
On Wed, Dec 23, 2020 at 08:32:45PM +0000, Pavel Begunkov wrote:
On 23/12/2020 20:23, Douglas Gilbert wrote:
On 2020-12-23 11:04 a.m., James Bottomley wrote:
On Wed, 2020-12-23 at 15:51 +0000, Christoph Hellwig wrote:
On Wed, Dec 23, 2020 at 12:52:59PM +0000, Pavel Begunkov wrote:
Can a scatterlist have 0-len entries? Those are directly translated
into bvecs, e.g. in nvme/target/io-cmd-file.c and
target/target_core_file.c. I've audited most of the others by now,
and they're fine.

For block layer SGLs we should never see them, and for nvme neither.
I think the same is true for the SCSI target code, but please double
check.

Right, no-one ever wants to see a 0-len scatter list entry. The reason
is that every driver uses the sgl to program the device DMA engine in
the way NVME does. A 0 length sgl would be a dangerous corner case:
some DMA engines would ignore it and others would go haywire, so if we
ever let a 0 length list down into the driver, they'd have to
understand the corner case behaviour of their DMA engine and filter it
accordingly, which is why we disallow them in the upper levels, since
they're effective nops anyway.

When using scatter gather lists at the far end (i.e. on the storage device)
the T10 examples (WRITE SCATTERED and POPULATE TOKEN in SBC-4) explicitly
allow the "number of logical blocks" in their sgl_s to be zero and state
that it is _not_ to be considered an error.

It's fine for my case unless they leak out of the device driver into the
net/block layer, etc. Do they?

None of the SCSI commands mentioned above are supported by Linux,
never mind mapped to struct scatterlist.


The POPULATE TOKEN / WRITE USING TOKEN pair can be viewed as a subset
of EXTENDED COPY (SPC-4) which also supports "range descriptors". It is
not clear if target_core_xcopy.c supports these range descriptors but
if it did, it would be trying to map them to struct scatterlist objects.

That said, it would be easy to skip the "number of logical blocks" == 0
case when translating range descriptors to sgl_s.
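
As a rough illustration of that skip, here is a minimal userspace sketch.
The struct layouts and the name ranges_to_segs() are hypothetical stand-ins
invented for this example; they are not the SPC-4 descriptor format, nor
anything in target_core_xcopy.c or struct scatterlist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range descriptor: not the on-the-wire SPC-4 layout. */
struct range_desc {
    uint64_t lba;        /* starting logical block address */
    uint32_t num_blocks; /* may legitimately be 0 */
};

/* Hypothetical segment, standing in for a scatterlist-like entry. */
struct seg {
    uint64_t offset; /* byte offset on the device */
    uint64_t length; /* byte length, never 0 on output */
};

/*
 * Translate range descriptors into segments, dropping any descriptor
 * whose block count is 0 so that no zero-length segment reaches the
 * lower layers. Returns the number of segments written to out.
 */
static size_t ranges_to_segs(const struct range_desc *rd, size_t n,
                             uint32_t block_size,
                             struct seg *out, size_t out_cap)
{
    size_t used = 0;

    assert(block_size != 0);
    for (size_t i = 0; i < n; i++) {
        if (rd[i].num_blocks == 0)
            continue; /* a no-op descriptor, so emit nothing */
        if (used == out_cap)
            break;    /* output array is full */
        out[used].offset = rd[i].lba * block_size;
        out[used].length = (uint64_t)rd[i].num_blocks * block_size;
        used++;
    }
    return used;
}
```

The filtering happens at translation time, so nothing below this point
needs to know that the descriptor list ever contained an empty range.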

In my ddpt utility (a dd clone) I have generalized skip= and seek= to
optionally take sgl_s. If the last element in one of those sgl_s is
LBAn,0 then it is interpreted as "until the end of that device" which
is further restricted if the other sgl has a "hard" length or count=
is given. The point being a length of 0 can have meaning, a benefit
lost with NVMe's 0-based counts.
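
To make that interpretation concrete, a small sketch follows. It is not
ddpt's actual code: struct sgl_elem, effective_num() and
zero_based_to_blocks() are names made up for this example, and the limit
argument is a simplification of the "hard length or count=" restriction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sgl element: a starting LBA and a block count. */
struct sgl_elem {
    uint64_t lba;
    uint32_t num; /* 0 in the last element means "to end of device" */
};

/*
 * Effective block count of element idx. A count of 0 in the last
 * element expands to the rest of the device; a 0 anywhere else stays
 * 0. If limit is non-NULL, the result is capped at *limit.
 */
static uint64_t effective_num(const struct sgl_elem *sgl, size_t n,
                              size_t idx, uint64_t dev_blocks,
                              const uint64_t *limit)
{
    uint64_t num = sgl[idx].num;

    if (num == 0 && idx == n - 1)
        num = sgl[idx].lba < dev_blocks ? dev_blocks - sgl[idx].lba : 0;
    if (limit != NULL && num > *limit)
        num = *limit;
    return num;
}

/*
 * With a 0-based count (as in the NVMe block count field), a field
 * value of 0 already means one block, so no value is left over to
 * carry a special meaning.
 */
static uint32_t zero_based_to_blocks(uint16_t field)
{
    return (uint32_t)field + 1;
}
```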

Doug Gilbert