Re: [PATCH 0/3] mm: Swap checksum

From: Valdis . Kletnieks
Date: Wed May 26 2010 - 17:29:14 EST


On Thu, 27 May 2010 00:31:44 +0900, Minchan Kim said:
> On Wed, May 26, 2010 at 07:21:57AM -0300, Cesar Eduardo Barros wrote:
> > far as I can see, does nothing against the disk simply failing to
> > write and later returning stale data, since the stale checksum would
> > match the stale data.
>
> Sorry. I can't understand your point.
> Who makes stale data? If any layer makes data as stale, integrity is up to
> the layer. Maybe I am missing your point.
> Could you explain more detail?

I'm pretty sure that what Cesar meant was that the following could happen:

1) Write block 11983 on the disk, checksum 34FE9B72.
(... time passes, maybe weeks ...)
2) Attempt to write block 11983 on disk with checksum AE9F3581. The write fails
due to a power failure or something.
(... more time passes...)
3) Read block 11983, get back data with checksum 34FE9B72. Checksum matches,
and there's no indication that the write in (2) ever failed. The program
proceeds thinking it's just read back the most recently written data, when in
fact it's just read an older version of that block. Problems can ensue if the
data just read is now out of sync with *other* blocks of data - instant data
corruption.
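
For anyone who wants to poke at it, the three steps above can be sketched as a
toy simulation. This is only an illustration under assumptions of my own: the
ToyDisk class, its `lost` flag, and the use of CRC32 are all made up for the
example and have nothing to do with how BLK_DEV_INTEGRITY or the patch set
actually stores or verifies checksums.

```python
import zlib


class ToyDisk:
    """Hypothetical block device that keeps a checksum next to each block."""

    def __init__(self):
        self.blocks = {}  # block number -> (data, checksum)

    def write(self, blockno, data, lost=False):
        # lost=True models a write that never reaches the media
        # (power failure, dropped write); the old contents stay in place.
        if lost:
            return
        self.blocks[blockno] = (data, zlib.crc32(data))

    def read(self, blockno):
        data, stored = self.blocks[blockno]
        # The check only compares the data against the checksum that was
        # stored alongside it, so it cannot tell "old" from "current".
        if zlib.crc32(data) != stored:
            raise IOError("checksum mismatch on block %d" % blockno)
        return data


disk = ToyDisk()
disk.write(11983, b"old contents")             # step 1
disk.write(11983, b"new contents", lost=True)  # step 2: write never lands
result = disk.read(11983)                      # step 3: verifies cleanly
print(result)
```

The read in step 3 raises no error and hands back the contents from step 1,
because the stale data and the stale checksum still agree with each other.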

To be fair, we currently have the "read a stale block" problem after crashes
already. The issue is that BLK_DEV_INTEGRITY can't provide a solution here,
but most users will form a mental image that it *is* in fact giving them
that guarantee. The resulting mismatch between reality and expectations
cannot end well.


