Re: [PATCH 0/4] (RESEND) ext3[34] barrier changes

From: Daniel Phillips
Date: Wed May 21 2008 - 18:30:58 EST


On Monday 19 May 2008 10:16, Chris Mason wrote:
> root@opti:~# fsck -f /dev/sda2
> fsck 1.40.8 (13-Mar-2008)
> e2fsck 1.40.8 (13-Mar-2008)
> /dev/sda2: recovering journal
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Problem in HTREE directory inode 281377 (/barrier-test): bad block number
> 13543.
> Clear HTree index<y>?

Nice, htree as a canary for disk corruption. This makes sense since
directory data is the only verifiable structure at the logical data
level and htree offers the only large scale, verifiable structure.

Thanks for the lovely test methodology example. Let me additionally
offer this tool:

devspam: http://code.google.com/p/zumastor/source/browse/trunk/ddsnap/tests/devspam.c?r=1564

The idea is to write an efficiently verifiable pattern across a range
of a file, including a mix of position-dependent codes and a user
supplied code. In read mode, devspam will check that the position
dependent codes are correct and match the user supplied code. This
can be easily extended to a "check that all the user supplied codes
are the same" mode, which would help detect consistency failure in
regular data files much as htree does with directories. Hmm, this
probably wants to incorporate a sequence number as well, to detect
corruption under a random block update load as you have triggered
with htree.
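
A minimal sketch of the pattern idea described above, for illustration
only. The record layout and the names stamp, stamp_fill, stamp_verify
and stamp_verify_uniform are assumptions made up for this example; they
are not taken from devspam.c, and the buffer here stands in for a range
of a file rather than doing real I/O.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte record, not devspam's actual format. */
struct stamp {
	uint64_t offset;	/* position-dependent code: byte offset of this record */
	uint32_t code;		/* user-supplied code, the same for one write pass */
	uint32_t check;		/* cheap mix of the two fields above */
};

static uint32_t stamp_check(uint64_t offset, uint32_t code)
{
	return (uint32_t)(offset ^ (offset >> 32)) ^ code ^ 0x5a5a5a5au;
}

/* Fill buf, which represents the file range starting at byte offset base. */
static void stamp_fill(void *buf, size_t len, uint64_t base, uint32_t code)
{
	assert(len % sizeof(struct stamp) == 0);
	for (size_t i = 0; i < len; i += sizeof(struct stamp)) {
		struct stamp s = { base + i, code, stamp_check(base + i, code) };
		memcpy((char *)buf + i, &s, sizeof s);
	}
}

/*
 * Check position codes and the expected user code.
 * Returns -1 if every record matches, else the file offset of the
 * first bad record, which is what shows where the damage occurred.
 */
static int64_t stamp_verify(const void *buf, size_t len, uint64_t base,
			    uint32_t code)
{
	assert(len % sizeof(struct stamp) == 0);
	for (size_t i = 0; i < len; i += sizeof(struct stamp)) {
		struct stamp s;
		memcpy(&s, (const char *)buf + i, sizeof s);
		if (s.offset != base + i || s.code != code ||
		    s.check != stamp_check(s.offset, s.code))
			return (int64_t)(base + i);
	}
	return -1;
}

/* "All user codes are the same" mode: take the code from the first record. */
static int64_t stamp_verify_uniform(const void *buf, size_t len, uint64_t base)
{
	struct stamp first;
	if (len < sizeof first)
		return -1;
	memcpy(&first, buf, sizeof first);
	return stamp_verify(buf, len, base, first.code);
}
```

A writer would call stamp_fill() on each buffer before writing it out,
and a reader would call stamp_verify() on what it reads back. A stale
or misplaced block then shows up as a mismatch at a known offset. A
sequence number would be one more field in the record.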

I used this tool to exorcise the majority of bugs in ddsnap. It is
a wonderful canary, not only catching bugs early but showing where
they occurred.

From what I have seen, Sun seems to rely mostly on MD5 checksums for
detecting corruption in ZFS. We should do more of that too.

Regards,

Daniel