Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

From: Jun'ichi Nomura
Date: Mon Oct 29 2012 - 07:07:00 EST


On 10/29/12 19:37, Andi Kleen wrote:
> Theodore Ts'o <tytso@xxxxxxx> writes:
>> On Mon, Oct 29, 2012 at 12:16:32PM +1100, Dave Chinner wrote:
>>> Except that there are filesystems that cannot implement such flags,
>>> or require on-disk format changes to add more of those flags. This
>>> is most definitely not a filesystem specific behaviour, so any sort
>>> of VFS level per-file state needs to be kept in xattrs, not special
>>> flags. Filesystems are welcome to optimise the storage of such
>>> special xattrs (e.g. down to a single boolean flag in an inode), but
>>> using a flag for something that dould, in fact, storage the exactly
>>> offset and length of the corruption is far better than just storing
>>> a "something is corrupted in this file" bit....
>>
>> Agreed, if we're going to add an xattr, then we might as well store
>
> I don't think an xattr makes sense for this. It's sufficient to keep
> this state in memory.
>
> In general these error paths are hard to test and it's important
> to keep them as simple as possible. Doing IO and other complexities
> just doesn't make sense. Just have the simplest possible path
> that can do the job.

And since it's difficult to prove, I think it's nice to have an
option to panic if the memory error was on dirty page cache.

It's theoretically same as disk I/O error; dirty cache is marked invalid
and next read will go to disk.
Though in practice, the next read will likely to fail if disk was broken.
(Given that transient errors are usually recovered by retries and fail-overs
in storage stack and not visible to applications which don't care.)
So it's "consistent" in some sense.
OTOH, the next read will likely succeed reading old data from disk
in case of the memory error.
I'm afraid the read-after-write inconsistency could cause silent data
corruption.

--
Jun'ichi Nomura, NEC Corporation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/