Spurious Filesystem corruption with ext3 + large (<400GB) hw RAID partition + SMP

From: Manfred Wassmann
Date: Sat Jan 31 2009 - 08:36:39 EST


Hello,

I'm experiencing spurious filesystem corruption problems with ext3 on
a partition larger than 400 GB on a hardware SATA RAID.

The problems first occured on an Intel dual core system with a custom
installation DVD which uses a 32 bit 2.6.26 kernel with SMP enabled
and a 4GB memory limit. The problems occured only randomly but it was
possible to reproduce them reliably with a standard Ubunu installation
disk using a 2.6.27-7 kernel (apparently also 32bit SMP kernel w 4GB
limit).
When installing Ubuntu a message appears in the kernel log stating:
"EXT3-fs error (device sda6): ext3_vlid_block_bitmap: Invalid block
bitmap - block_group = 3076, block = <some large number>"
This message is the first indication of any problem, the filesystem is
then remounted readonly and the installation stops understandably.

The problem does only occur with an ext3 filesystem, installation
using an xfs filesystem completes without error.

Furthermore with the installation from my custom installation DVD
succeeded, which installs a prebuild Debian etch system featuring a 64
bit monolithic 2.6.26 kernel, i was able to stress the system to its
limits using the stress and dbench programs without any sign of
filesystem corruption.

As XFS does substantially outperform ext3 under the given
circumstances I'm going to use that but if someone wants to
investigate the problem and needs further details I'll do my very best
;-)

regards Manfred Wassmann
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/