Re: [patch] document flash/RAID dangers

From: Pavel Machek
Date: Sat Aug 29 2009 - 06:19:15 EST


>> The example I saw went like this:
>>
>> A drive in a RAID 5 array failed; a hot spare was available (no idea
>> about a UPS). The system apparently locked up trying to talk to the
>> failed drive, or maybe the admin just was not patient enough, so he
>> power-cycled the array. He lost the array.
>>
>> So while most people will not aggressively power-cycle a RAID array,
>> a drive failure still provokes little-tested error paths, and getting
>> an unclean shutdown is quite easy in such a case.
>
> Then what we need to document is "do not power cycle an array during
> a rebuild", right?

Yep, that and the fact that you should fsck if you do.

> If it wasn't the admin that timed out and the box really was hung (no
> drive activity lights, etc), you will need to power cycle/reboot but
> then you should not have this active rebuild issuing writes either...

Ok, I guess you are right here.
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html