Re: [man-pages RFC PATCH v4] statx, inode: document the new STATX_INO_VERSION field

From: Theodore Ts'o
Date: Tue Sep 13 2022 - 05:39:43 EST


On Tue, Sep 13, 2022 at 01:30:58PM +1000, NeilBrown wrote:
> On Tue, 13 Sep 2022, Dave Chinner wrote:
> >
> > Indeed, we know there are many systems out there that mount a
> > filesystem, preallocate and map the blocks that are allocated to a
> > large file, unmount the filesystem, mmap the ranges of the block
> > device and pass them to RDMA hardware, then have sensor arrays rdma
> > data directly into the block device.....
>
> And this tool doesn't update the i_version? Sounds like a bug.

Tools that do this include "grub" and "lilo". Fortunately, most
people aren't trying to export their /boot directory over NFS. :-P

That being said, all we can strive for is "good enough" and not
"perfection". So if I were to add a "crash counter" to the ext4
superblock, I can make sure it gets incremented (a) whenever the
journal is replayed (assuming that we decide to use lazytime-style
updates for i_version for performance reasons), or (b) when fsck needs
to fix some file system inconsistency, or (c) when some external tool
like debugfs or fuse2fs is modifying the file system.
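
As a minimal sketch of the counter idea above: every name here
(sb_sketch, note_event, version_tag, tag_still_valid) is hypothetical and
is not ext4 code. It also assumes, purely for illustration, that a
consumer treats a cached i_version as valid only if the crash counter it
was sampled under is unchanged.

```c
#include <stdint.h>

/* Hypothetical events; the names are illustrative, not ext4 code. */
enum crash_event {
    EV_CLEAN_MOUNT,     /* nothing happened behind the kernel's back */
    EV_JOURNAL_REPLAY,  /* (a) the journal was replayed */
    EV_FSCK_REPAIR,     /* (b) fsck fixed an inconsistency */
    EV_EXTERNAL_WRITE   /* (c) debugfs, fuse2fs, etc. modified the fs */
};

/* Hypothetical stand-in for the superblock field. */
struct sb_sketch {
    uint32_t crash_counter;
};

/* Bump the counter for any event that may have lost or bypassed
 * i_version updates; a clean mount leaves it alone. */
static void note_event(struct sb_sketch *sb, enum crash_event ev)
{
    if (ev != EV_CLEAN_MOUNT)
        sb->crash_counter++;
}

/* Assumption: a consumer caches the counter alongside i_version. */
struct version_tag {
    uint32_t crash_counter;
    uint64_t i_version;
};

/* Returns 1 only if both the counter and i_version still match. */
static int tag_still_valid(const struct version_tag *cached,
                           const struct sb_sketch *sb,
                           uint64_t i_version)
{
    return cached->crash_counter == sb->crash_counter &&
           cached->i_version == i_version;
}
```

With this shape, a journal replay invalidates every cached tag at once,
even for inodes whose i_version looks unchanged on disk.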

Will this get *everything*?  No.  For example, in addition to Linux
boot loaders, there might be userspace which uses FIEMAP to get the
physical block numbers for a file, and then reads and writes to those
blocks using a kernel-bypass interface for high-speed SSDs.  I happen
to know of thousands of machines that are doing this with ext4 in
production today, so this isn't a hypothetical example; fortunately,
they aren't exporting their file system over NFS, nor are they likely
to do so. :-)
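
For readers who haven't used FIEMAP, the lookup step might look roughly
like the sketch below. It is Linux-only and uses the FS_IOC_FIEMAP ioctl
from <linux/fs.h>; the helper name print_extents and the MAX_EXTENTS cap
are made up for illustration. It shows only the mapping query, not the
kernel-bypass I/O that follows, which never touches the inode and so
never bumps i_version.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, struct fiemap_extent */

#define MAX_EXTENTS 32     /* arbitrary cap for this sketch */

/* Print up to MAX_EXTENTS logical-to-physical mappings for a file.
 * Returns the number of extents reported, or -1 on any error
 * (including filesystems that do not support FIEMAP). */
static int print_extents(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t sz = sizeof(struct fiemap) +
                MAX_EXTENTS * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, sz);
    if (!fm) {
        close(fd);
        return -1;
    }

    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;  /* map the whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;    /* flush dirty data first */
    fm->fm_extent_count = MAX_EXTENTS;

    int n = -1;
    if (ioctl(fd, FS_IOC_FIEMAP, fm) == 0) {
        n = (int)fm->fm_mapped_extents;
        for (int i = 0; i < n; i++) {
            const struct fiemap_extent *e = &fm->fm_extents[i];
            printf("logical %llu physical %llu length %llu\n",
                   (unsigned long long)e->fe_logical,
                   (unsigned long long)e->fe_physical,
                   (unsigned long long)e->fe_length);
        }
    }

    free(fm);
    close(fd);
    return n;
}
```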

- Ted