Re: [PATCH v5 3/9] iov_iter: Use IOCB/IOMAP_WRITE if available rather than iterator direction

From: Bart Van Assche
Date: Thu Jan 12 2023 - 16:58:14 EST


On 1/12/23 09:37, Al Viro wrote:
On Thu, Jan 12, 2023 at 06:08:14AM -0800, Christoph Hellwig wrote:
On Thu, Jan 12, 2023 at 10:31:01AM +0000, David Howells wrote:
And use the information in the request for this one (see patch below),
and then move this patch first in the series, add an explicit direction
parameter in the gup_flags to the get/pin helper and drop iov_iter_rw
and the whole confusing source/dest information in the iov_iter entirely,
which is a really nice big tree-wide cleanup that removes redundant
information.

Fine by me, but Al might object as I think he wanted the internal checks. Al?

I'm happy to have another discussion, but the fact that the information in
the iov_iter is 98% redundant and that various callers got it wrong and got
away with it is a pretty good sign that we should drop this information. It
also nicely simplifies the API.

I have no problem with getting rid of iov_iter_rw(), but I would really like to
keep ->data_source. If nothing else, any place getting direction wrong is
trouble waiting to happen - something that is currently dealing only with
iovec and bvec might be given e.g. a pipe.

Speaking of which, I would really like to get rid of the kludge /dev/sg is
pulling - right now from-device requests there do the following:
* copy the entire destination in (and better hope that nothing is mapped
write-only, etc.)
* form a request + bio, attach the pages with the destination copy to it
* submit
* copy the damn thing back to destination after the completion.
The reason for that is (quoted in commit ecb554a846f8)

====
The semantics of SG_DXFER_TO_FROM_DEV were:
- copy user space buffer to kernel (LLD) buffer
- do SCSI command which is assumed to be of the DATA_IN
(data from device) variety. This would overwrite
some or all of the kernel buffer
- copy kernel (LLD) buffer back to the user space.
The idea was to detect short reads by filling the original
user space buffer with some marker bytes ("0xec" it would
seem in this report). The "resid" value is a better way
of detecting short reads but that was only added this century
and requires co-operation from the LLD.
====
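
To make the marker-byte idea above concrete, here is a minimal userspace sketch. It is not the sg driver code: fake_device_read(), guess_len_from_marker() and marker_demo() are hypothetical names invented for illustration, and the "device" is just a memset(). Only the 0xec marker value comes from the quoted commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MARKER 0xec	/* marker byte mentioned in the quoted commit message */

/* Hypothetical stand-in for a device that may return fewer bytes than asked. */
static size_t fake_device_read(unsigned char *buf, size_t len, size_t avail,
			       unsigned char payload)
{
	size_t n = avail < len ? avail : len;

	memset(buf, payload, n);	/* overwrite only what was "transferred" */
	return n;
}

/* Guess the transferred length by stripping trailing marker bytes. */
static size_t guess_len_from_marker(const unsigned char *buf, size_t len)
{
	while (len > 0 && buf[len - 1] == MARKER)
		len--;
	return len;
}

/* Pre-fill with the marker, run the fake transfer, then guess its length. */
static size_t marker_demo(size_t len, size_t avail, unsigned char payload)
{
	unsigned char buf[64];

	assert(len <= sizeof(buf));
	memset(buf, MARKER, len);
	(void)fake_device_read(buf, len, avail, payload);
	return guess_len_from_marker(buf, len);
}
```

With a payload of 0x11, marker_demo(32, 20, 0x11) recovers the short read. With a payload that happens to be 0xec, the guess collapses to zero, which is one reason a residual count is the more reliable signal.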

IOW, we can't tell how much we actually want to copy out, unless the SCSI driver
in question is recent enough. Note that the above had been written in 2009, so
it might not be an issue these days.

Do we still have SCSI drivers that would not set the residual on completion of
bypass requests? Because I would obviously very much prefer to get rid of that
copy-in/overwrite/copy-out thing there - given accurate information about
the transfer length it would be easy to do.
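
As a rough illustration of what "easy to do" could look like once the residual is trustworthy, the following userspace sketch copies out only the bytes that were actually transferred and leaves the rest of the destination untouched. This is an assumption-laden model, not kernel code: struct fake_completion, copy_out_len() and complete_read() are hypothetical names, and the residual is assumed to mean "requested length minus bytes transferred".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical completion record: requested length and reported residual. */
struct fake_completion {
	size_t dxfer_len;	/* bytes requested */
	size_t resid;		/* bytes NOT transferred, as reported on completion */
};

/* Bytes that actually arrived; treat a bogus residual as "nothing arrived". */
static size_t copy_out_len(const struct fake_completion *c)
{
	if (c->resid > c->dxfer_len)
		return 0;
	return c->dxfer_len - c->resid;
}

/* Copy only the transferred part, so no prior copy-in of dst is needed. */
static size_t complete_read(unsigned char *dst, const unsigned char *kbuf,
			    const struct fake_completion *c)
{
	size_t n = copy_out_len(c);

	assert(dst != NULL && kbuf != NULL);
	memcpy(dst, kbuf, n);
	return n;
}
```

Since bytes past the transferred length are never written, whatever the destination held beforehand survives without the initial copy-in.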

(+Martin and Doug)

I'm not sure that we still need the double copy in the sg driver. It seems unlikely to me that any user space software relies on finding "0xec" in bytes that did not originate from a SCSI device. Additionally, SCSI drivers that do not support residuals should be a thing of the past.

Others may be better qualified to comment on this topic.

Bart.