Re: Kernels v4.9+ cause short reads of block devices

From: Linus Torvalds
Date: Wed Aug 23 2017 - 16:13:15 EST


On Wed, Aug 23, 2017 at 12:53 PM, Doug Nazar <nazard@xxxxxxxx> wrote:
>
> It's compiling now, but I think it's already set to MAX_LFS_FILESIZE.
>
> [ 169.095127] ppos=80180006000, s_maxbytes=7ffffffffff, magic=0x62646576, type=bdev

Oh, right you are - I'm much too used to 64-bit, where
MAX_LFS_FILESIZE is basically infinite, and was just assuming that it
was something like the UFS bug we had not that long ago that was due
to the 32-bit limit.

But yes, on 32-bit, we are limited by the 32-bit index into the page
cache, and we limit the index to 31 bits too, so we have (PAGE_SIZE <<
31) -1, which is that 7ffffffffff.

And that also explains why people haven't seen it. You do need

(a) a 32-bit environment

(b) a disk larger than 8TB in size

The *hard* limit for the page cache on a 32-bit environment should
actually be (PAGE_SIZE << 32)-PAGE_SIZE (that final PAGE_SIZE
subtraction is to make sure we never generate a page cache index
of -1), so having a disk that is 16TB or larger is not going to
work, but your disk is right in that 8TB-16TB hole that used to work
and was broken by that check.
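
A quick user-space sketch of that arithmetic, assuming 4 KiB pages and a
32-bit long. The SKETCH_* names and the helper functions are made up for
illustration and are not kernel code, and the comparison helper is only a
stand-in for whatever the real check does:

```c
#include <stdint.h>

/* Assumptions for illustration only: 4 KiB pages, 32-bit unsigned long. */
#define SKETCH_PAGE_SIZE     UINT64_C(4096)
#define SKETCH_BITS_PER_LONG 32

/* Limit with the page cache index capped at 31 bits. */
static inline uint64_t sketch_old_limit(void)
{
	return (SKETCH_PAGE_SIZE << (SKETCH_BITS_PER_LONG - 1)) - 1;
}

/*
 * Hard limit with the full 32-bit index; the last page is dropped so
 * that an index of -1 (all bits set) is never produced.
 */
static inline uint64_t sketch_hard_limit(void)
{
	return (SKETCH_PAGE_SIZE << SKETCH_BITS_PER_LONG) - SKETCH_PAGE_SIZE;
}

/* Hypothetical stand-in for the position check: nonzero if pos is below limit. */
static inline int sketch_pos_below_limit(uint64_t pos, uint64_t limit)
{
	return pos < limit;
}
```

With the ppos from the log above (0x80180006000), sketch_pos_below_limit()
fails against the old limit (0x7ffffffffff) but passes against the hard
limit (0xffffffff000), which is the 8TB-16TB hole.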

Anyway, that makes me feel better. I should have looked at your disk
size more; now I at least understand why nobody noticed before.

So just throw away my patch. That's wrong, and garbage.

The *right* patch is likely just this instead:

-#define MAX_LFS_FILESIZE (((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1)
+#define MAX_LFS_FILESIZE (((loff_t)PAGE_SIZE << BITS_PER_LONG)-PAGE_SIZE)

which should make MAX_LFS_FILESIZE be 0xffffffff000 and your disk size
should be ok.

Linus