Re: NFS hang + umount -f: better behaviour requested.

From: Valdis . Kletnieks
Date: Fri Aug 31 2007 - 11:10:27 EST


On Fri, 31 Aug 2007 16:06:36 +0800, Ian Kent said:
> So, there's a power outage and the UPS had a glitch.

Murphy can get a *lot* more creative than that.

So we'd outgrown the capacity on our UPS and diesel generator, and decided
to replace them. So we scheduled downtime for a Saturday. Rather scary: we
had a Sun E10K that had been powered up for several years, and just as expected,
a good fraction of its 400+ drives failed to spin back up. While recovering
from that, we discovered that although the vast majority of the 400+ drives were
either mirrors or raidsets, due to a config error, the boot volume wasn't
mirrored (fortunately, it spun up OK so we dodged the bullet), so we fixed that.

Literally the next Friday, not even a week later, a contractor relocating a
door into our machine room shorted out a sensor circuit in our fire suppression
system, triggering a Halon dump. Of course, no amount of UPS and diesel was
going to save us now, because there was a safety interlock that killed the
power feeds if the Halon dumped. This time, since they'd all been stressed
just a week before, only 2 of the 400+ disks on the E10K failed to spin up.

Guess which two. ;)



