Re: Something corrupts raid5 disks slightly during reboot

From: Ville Herva
Date: Sat Nov 01 2003 - 03:35:15 EST


On Fri, Oct 31, 2003 at 05:57:33PM -0800, you [Mike Fedyk] wrote:
> On Fri, Oct 31, 2003 at 07:41:30PM -0600, Jeffrey E. Hundstad wrote:
> > Try:
> >
> > hdparm -W0 /dev/hdX
> >
> > for each of your ide drives. This turns off write-caching which is
> > usually a bad thing with ide drives anyway.
> >
>
> Also try installing smartmontools, and run smartctl -a on each of the
> drives. It might tell you one of the drives is going bad...

I monitor all my drives with SMART constantly, and they haven't shown
any symptoms. The corruption only happens upon reboot, which is quite a
rare event for a server.

Also, I find that SMART rarely gives much useful warning before a
drive is about to fail. And when a drive fails I usually get a good dose
of UncorrectableErrors in the log, not silent corruption (and I've seen a
lot of drives fail ;( ). Silent corruption is usually caused by the chipset
or driver (seen that, too ;( ), but it has usually happened under stress,
not when nothing much is being written to the drive.
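
For reference, the two per-drive suggestions quoted above could be scripted
roughly as follows. This is only a dry-run sketch: it prints the commands
rather than running them, the device names /dev/hda and /dev/hdc are
placeholders rather than the actual drives in this array, and smartctl is
assumed to be the smartmontools command-line tool.

```shell
#!/bin/sh
# Dry-run sketch: print the per-drive commands instead of executing them.
# print_drive_checks is a hypothetical helper name, not part of any tool.
print_drive_checks() {
    for dev in "$@"; do
        # Turn off the drive's write cache (the hdparm suggestion above).
        echo "hdparm -W0 $dev"
        # Dump all SMART information for the drive (smartmontools).
        echo "smartctl -a $dev"
    done
}

# Placeholder device names; substitute the real IDE drives.
print_drive_checks /dev/hda /dev/hdc
```

Piping the output to sh as root would actually run the commands, once the
device list has been checked.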


-- v --

v@xxxxxx