Re: [PATCH v4 12/39] netfs: Add iov_iters to (sub)requests to describe various buffers

From: David Howells
Date: Tue Dec 19 2023 - 09:41:19 EST


David Howells <dhowells@xxxxxxxxxx> wrote:

> > > @@ -88,6 +78,11 @@ static void netfs_read_from_server(struct netfs_io_request *rreq,
> > > struct netfs_io_subrequest *subreq)
> > > {
> > > netfs_stat(&netfs_n_rh_download);
> > > + if (iov_iter_count(&subreq->io_iter) != subreq->len - subreq->transferred)
> > > + pr_warn("R=%08x[%u] ITER PRE-MISMATCH %zx != %zx-%zx %lx\n",
> > > + rreq->debug_id, subreq->debug_index,
> > > + iov_iter_count(&subreq->io_iter), subreq->len,
> > > + subreq->transferred, subreq->flags);
> >
> > pr_warn is a bit alarmist, esp given the cryptic message. Maybe demote
> > this to INFO or DEBUG?
> >
> > Does this indicate a bug in the client or that the server is sending us
> > malformed frames?
>
> Good question. The network filesystem updated subreq->transferred to indicate
> it had transferred X amount of data, but the iterator had been updated to
> indicate Y amount of data was transferred. They really ought to match as it
> may otherwise indicate an underrun (and potential leakage of old data).
> Overruns are less of a problem since the iterator would have to 'go negative'
> as it were.
>
> However, it might be better just to leave io_iter unchecked since we end up
> resetting it anyway each time we reinvoke the ->issue_read() op. It's always
> possible that it will get copied and a different iterator get passed to the
> network layer or cache fs - and so the change to the iterator then has to be
> manually propagated just to avoid the warning.

Actually, it's more complicated than that. It's an assertion that netfslib is
doing the right prep. This assertion is checked both when we initially make a
request (in which case it definitely shouldn't fire) and when we perform a
resubmission on partial/complete read failure when we need to carefully
revalidate the numbers to make sure we don't end up with holes or wrinkles in
the buffer.

Anyway, it shouldn't happen - but if it does, it probably presages data
corruption.

David