Brad Campbell <brad@xxxxxxxxxxx> writes:
> Greg Stark wrote:
>> Any clue what I need to do to achieve this? Is this a bug because this isn't a
>> well-travelled code-path? (Dead drives not being something you can conjure up
>> on demand)? Or is this indicative of more problems than just a crashed drive?
>> This is on a stock 2.6.6 kernel tree, btw.
> Known issue, fixed in 2.6.9-rc1. Apply this to 2.6.6 and you're good to go.
Hm. I'm running 2.6.9-rc1 now. I'm not sure this really fixed the problem.
I get the same message and the same basic symptom -- any process touching the
bad disk goes into disk-wait for a long time. But whereas before as far as I
know they never came out, now they seem to come out of disk-wait after a good
long time. But then maybe I just never waited long enough with 2.6.6.
I do still get the "ATA: abnormal status 0x59 on port 0xEFE7" so if that's a
sign something's wrong then something's still wrong. I also now get additional
messages that I don't recall seeing before. They're included below.
And as I said, every other process touching the drive, even in good areas,
enters disk-wait. If I kill -9 the process generating the errors and wait a
few minutes they seem to eventually exit disk-wait though.
This means I would be able to do the recovery in theory, but in practice it'll
just take an infeasible length of time. I have gigs of data to go through, and
given how long each error takes to time out, it'll take me many days (years, I
think) just to figure out which blocks to avoid.
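
To put rough numbers on that estimate, here is a minimal sketch of the
arithmetic. The function name and both input figures are hypothetical,
picked only for illustration; the actual number of bad blocks and the actual
per-error timeout on this drive were not measured.

```python
SECONDS_PER_DAY = 86400


def estimated_scan_days(bad_blocks, timeout_seconds):
    """Days spent waiting on timeouts if every bad block is probed once.

    Assumes each failed read blocks for the full timeout and that time spent
    on good blocks is negligible by comparison.
    """
    return bad_blocks * timeout_seconds / SECONDS_PER_DAY


# Hypothetical inputs: 100,000 bad blocks, 2 minutes per timed-out read.
days = estimated_scan_days(bad_blocks=100_000, timeout_seconds=120)
print(f"{days:.1f} days")
```

Under those assumed figures the wait alone comes to roughly 139 days, and it
scales linearly with both the number of bad blocks and the timeout.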