Re: [PATCH v14 090/138] block: Add bio_add_folio()

From: Matthew Wilcox
Date: Thu Jul 15 2021 - 18:28:36 EST


On Thu, Jul 15, 2021 at 01:59:17PM -0700, Darrick J. Wong wrote:
> On Thu, Jul 15, 2021 at 04:36:16AM +0100, Matthew Wilcox (Oracle) wrote:
> > This is a thin wrapper around bio_add_page(). The main advantage here
> > is the documentation that the submitter can expect to see folios in the
> > completion handler, and that stupidly large folios are not supported.
> > It's not currently possible to allocate stupidly large folios, but if
> > it ever becomes possible, this function will fail gracefully instead of
> > doing I/O to the wrong bytes.
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> > ---
> > block/bio.c | 21 +++++++++++++++++++++
> > include/linux/bio.h | 3 ++-
> > 2 files changed, 23 insertions(+), 1 deletion(-)
> >
> > diff --git a/block/bio.c b/block/bio.c
> > index 1fab762e079b..1b500611d25c 100644
> > --- a/block/bio.c
> > +++ b/block/bio.c
> > @@ -933,6 +933,27 @@ int bio_add_page(struct bio *bio, struct page *page,
> > }
> > EXPORT_SYMBOL(bio_add_page);
> >
> > +/**
> > + * bio_add_folio - Attempt to add part of a folio to a bio.
> > + * @bio: Bio to add to.
> > + * @folio: Folio to add.
> > + * @len: How many bytes from the folio to add.
> > + * @off: First byte in this folio to add.
> > + *
> > + * Always uses the head page of the folio in the bio. If a submitter
> > + * only uses bio_add_folio(), it can count on never seeing tail pages
> > + * in the completion routine. BIOs do not support folios larger than 2GiB.
> > + *
> > + * Return: The number of bytes from this folio added to the bio.
> > + */
> > +size_t bio_add_folio(struct bio *bio, struct folio *folio, size_t len,
> > + size_t off)
> > +{
> > + if (len > UINT_MAX || off > UINT_MAX)
>
> Er... if bios don't support folios larger than 2GB, then why check @off
> and @len against UINT_MAX, which is ~4GB?

I suppose that's mostly a documentation problem. The limit is:

struct bio_vec {
struct page *bv_page;
unsigned int bv_len;
unsigned int bv_offset;
};

so we can support folios which are 2GB in size (0x8000'0000) bytes, but
if (theoretically, some day) we had a 4GB folio, we wouldn't be able to
handle it in a single bio_vec. So there isn't anything between a 2GB
folio and a 4GB folio; a 4GB folio isn't allowed and a 2GB one is.

I don't think anything's wrong here, but maybe things could be worded
better?