Re: [PATCH 04/11] block: Add bio_for_each_thp_segment_all

From: Christoph Hellwig
Date: Tue Sep 01 2020 - 10:50:49 EST


On Tue, Sep 01, 2020 at 02:05:25PM +0100, Matthew Wilcox wrote:
> > > struct page *page = bvec->bv_page;
> > >
> > > while (length > 0) {
> > > size_t count = thp_size(page) - offset;
> > >
> > > if (count > length)
> > > count = length;
> > > iomap_read_page_end_io(page, offset, count, error);
> > > page += (offset + count) / PAGE_SIZE;
> >
> > Shouldn't the page_size here be thp_size?
>
> No. Let's suppose we have a 20kB I/O which starts on a page boundary and
> the first page is order-2. To get from the first head page to the second
> page, we need to add 4, which is 16kB / 4kB, not 16kB / 16kB.

True.

> I'm not entirely sure the bvec would shrink. On 64-bit systems, it's
> currently 8 bytes for the struct page, 4 bytes for the len and 4 bytes
> for the offset. Sure, we can get rid of the offset, but the compiler
> will just pad the struct from 12 bytes back to 16. On 32-bit systems
> with 32-bit phys_addr_t, we go from 12 bytes down to 8, but most 32-bit
> systems have a 64-bit phys_addr_t these days, don't they?

Actually on those system that still are 32-bit because they are so
tiny I'd very much still expect a 32-bit phys_addr_t. E.g. arm
without LPAE or 32-bit RISC-V.

But yeah, point taken on the alignment for the 64-bit ones.

> That's a bit more boilerplate than I'd like, but if bio_vec is going to
> lose its bv_page then I don't see a better way. Unless we come up with
> a different page/offset/length struct that bio_vecs are decomposed into.

I'm not sure it is going to lose bv_page any time soon. I'd sure like
to, but least time something like that came up Linus wasn't entirely
in favor. Things might have changed now, though and I think it is about
time to give it another try.