Re: ide errors in 7-rc1-mm1 and later

From: Andrew Morton
Date: Wed Jun 09 2004 - 18:53:56 EST


Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@xxxxxxxxxxxxxx> wrote:
>
> Does journal has checksum or some other protection against failure during
> writing journal to a disk? If not than it still can be screwed even with
> ordered writes if we are unfortunate enough. ;-)

A transaction is written to disk as two synchronous operations: write all
the data, wait on it, write the single commit block, wait on that.

If the commit block were to hit disk before the data then we have a window
in which poweroff+recovery would replay garbage into the filesystem.

So I think we have a bug in the current ext3 barrier implementation - we
need a blk_issue_flush() before submitting the buffer_ordered commit block.


> > > - if you more than > 1 filesystem on the disk (quite likely scenario) it
> > > can happen that barrier (flush) will fail for sector for file from the
> > > other fs and later barrier for this other fs will succeed
> >
> > I don't understand this one.
>
> Flush command can fail for sector which came into disk's write cache
> from some write request for some other fs on the same disk i.e.

Oh, you're referring to actual I/O errors? I tend to think all bets are
off if that happens - make sure we report it to the application, avoid
crashing the kernel and hope that fsck can get the data back...
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/