Re: raid5 multi-drive-failure and recovery?

From: harik@chaos.ao.net
Date: Mon Dec 23 2002 - 01:29:43 EST

Next message: Martin J. Bligh: "[PATCH] 0/8 Move NUMA-Q support into subarch"
Previous message: Joseph: "Re: USB 2.0 is too slow?"
In reply to: Neil Brown: "Re: raid5 multi-drive-failure and recovery?"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, 20 Dec 2002, Neil Brown wrote:

> > fail, catch it in raid5_end_read_request then tag the stripe_head
> > with the device that's failed. If one has already failed, return
> > EIO. This way further reads on the stripe_head will go to the parity
> > disk (until it's eventually freed. One IO error per stripe isn't too
> > harsh a price to pay for disaster recovery)
> >
> > in 2.4.20, I'm at raid5.c:421 where we're about to call md_error.
> > What happens to the bh from that point? Obviously, it's not up-to-date,
> > so when 1 drive fails how does it get re-issued to be pulled from a parity
> > drive to reconstruct it?
>
> Nothing much happens to the bh. It just has Uptodate cleared.
>
> A little later (line 433) we call release_stripe. Once all the active
> IO requests have finished the stripe will be completely released and
> handle_stripe gets called (from raid5d) to handle the stripe.
> It notices there is a block that is being read (bh_read) but that
> isn't uptodate, and so tries to schedule a read. Line 949 notices
> that the block it wants to read is on a failed drive, so it causes all
> blocks to be read in. Once they are read in, handle_stripe gets
> called again, and this time it gets to line 955 where it computes the
> block that you want.

Ok, I was unclear on what the retry magic was. So we stall the request
on a failure, mark the drive 'failed', and eventually handle_stripe
notices "Hey, that wasn't finished" and restarts it, this time reading
the surviving drives and computing the missing block from parity.
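
To make sure I have the idea straight, here is a rough user-space
sketch of it. This is not the raid5.c code: struct stripe,
rebuild_block, stripe_read_error, NDISKS and BLKSZ are all names and
sizes I made up for illustration. It only shows the XOR reconstruction
plus the "tag the stripe with the failed device, EIO on a second
failure" behaviour discussed above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define NDISKS 4   /* hypothetical: 3 data blocks + 1 parity block */
#define BLKSZ  8   /* hypothetical, tiny block size for illustration */

struct stripe {
    unsigned char blk[NDISKS][BLKSZ];
    int failed;    /* device that failed on this stripe, or -1 for none */
};

/* Recompute block `dev` as the XOR of every other block in the stripe.
 * Called on the parity index, this generates parity; called on a data
 * index, it reconstructs that data block from the rest. */
static void rebuild_block(struct stripe *s, int dev)
{
    assert(dev >= 0 && dev < NDISKS);
    memset(s->blk[dev], 0, BLKSZ);
    for (int d = 0; d < NDISKS; d++) {
        if (d == dev)
            continue;
        for (size_t i = 0; i < BLKSZ; i++)
            s->blk[dev][i] ^= s->blk[d][i];
    }
}

/* Handle a read error on `dev`: tolerate one failed device per stripe,
 * give up with -EIO if a different device has already failed. */
static int stripe_read_error(struct stripe *s, int dev)
{
    if (s->failed >= 0 && s->failed != dev)
        return -EIO;
    s->failed = dev;
    rebuild_block(s, dev);
    return 0;
}
```

With parity generated by rebuild_block(&s, 3), a failed read on device
1 gets its contents back from devices 0, 2 and 3, while a later failure
on device 2 of the same stripe returns -EIO.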

I'll try to post a preliminary patch in the near future. Is there a
patch for loopback to simulate errors on demand? I've already done
drive forensics to recover the disks in question, and I'd much rather
test on a set of 500 MB loop devices than on a 60 GB raid. (For obvious
reasons.)
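
For what it's worth, the kind of behaviour I mean looks roughly like
the following in user space. This is only a sketch of the idea, not a
loop driver patch: struct faulty_dev and faulty_pread are hypothetical
names, and a plain file descriptor stands in for the loop device. Any
read that overlaps a configured byte range fails with EIO; everything
else goes through to pread().

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

struct faulty_dev {
    int   fd;         /* backing file standing in for a loop device */
    off_t bad_start;  /* first byte offset that should fail */
    off_t bad_len;    /* length of the failing region; 0 disables it */
};

/* Read like pread(), but fail with EIO if the request touches the
 * configured bad region. */
static ssize_t faulty_pread(const struct faulty_dev *d, void *buf,
                            size_t len, off_t off)
{
    if (d->bad_len > 0 &&
        off < d->bad_start + d->bad_len &&
        off + (off_t)len > d->bad_start) {
        errno = EIO;
        return -1;
    }
    return pread(d->fd, buf, len, off);
}
```

Being able to move bad_start and bad_len around at run time is the
"on demand" part: it would let a test hit one drive, then a second
one on a different stripe.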

Thanks,

--Dan




This archive was generated by hypermail 2b29 : Mon Dec 23 2002 - 22:00:31 EST