Re: Overagressive failing of disk reads, both LIBATA and IDE

From: Mark Lord
Date: Sat Mar 21 2009 - 11:22:52 EST

Next message: Ingo Molnar: "Re: [PATCH] trace/ring_buffer: fix section mismatch warning"
Previous message: Li Zefan: "[tip:tracing/blktrace] blktrace: remove sysfs_blk_trace_enable_show/store()"
In reply to: James Bottomley: "Re: Overagressive failing of disk reads, both LIBATA and IDE"
Next in thread: Alan Cox: "Re: Overagressive failing of disk reads, both LIBATA and IDE"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

James Bottomley wrote:

On Sat, 2009-03-21 at 10:55 -0400, Mark Lord wrote:

The patch *does* use the disk supplied data about the error,
and returns success for sectors up to that point. Where it differs
from mainline SCSI is that it then continues attempting the remaining
2000 sectors (or whatever) of the request, hoping that not all of
them are bad.

Um, but so does SCSI without your patch ... that was my point.

..

Does it? I thought it still just failed everything after the first
bad sector? Kudos are due if that's working now.
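
The two behaviours being compared here (fail everything after the first bad sector, versus carry on with the rest of the request) can be modelled roughly as below. This is a toy Python sketch, not kernel code: `read_request`, `bad_sectors` and `continue_after_error` are hypothetical names, and the simulated drive is assumed to report the first bad sector of each attempt.

```python
def read_request(start, count, bad_sectors, continue_after_error=True):
    """Toy model: return (good, failed) sector lists for one read request.

    bad_sectors is the set of unreadable sector numbers (an assumption of
    this model; a real drive only reports the failing LBA of each attempt).
    """
    good, failed = [], []
    pos, end = start, start + count
    while pos < end:
        # First bad sector at or after pos, as the drive would report it.
        bad = min((s for s in bad_sectors if pos <= s < end), default=None)
        if bad is None:
            good.extend(range(pos, end))      # remainder read cleanly
            break
        good.extend(range(pos, bad))          # sectors before the error succeeded
        failed.append(bad)
        if not continue_after_error:
            failed.extend(range(bad + 1, end))  # give up on the rest
            break
        pos = bad + 1                         # keep going past the bad sector
    return good, failed
```

With two bad sectors in a ten-sector request, the "continue" mode loses only those two sectors, while the "stop" mode loses everything from the first bad sector onwards.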

..

I don't really think we'd do that. The problem, as you say, is request
combination. I think if we really wanted to do this, we'd have block do
it. Each separate request that's merged gets a separate bio, and block
already has capabilities to pick up per bio errors, so we'd do the
partial completion of the failing bio then skip to the next one in the
request to try. That would completely solve both readahead problems and
request merging ones.
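
A rough illustration of the per-bio idea described above, again as a toy Python model rather than block-layer code. The `(start, count)` tuples standing in for bios and the name `complete_per_bio` are hypothetical; the sketch assumes each merged bio covers a contiguous sector range and that the first bad sector within it is known.

```python
def complete_per_bio(bios, bad_sectors):
    """Toy model: complete each bio of a merged request on its own.

    bios is a list of (start, count) sector ranges, one per merged bio.
    Returns one (sectors_completed, error) tuple per bio.
    """
    results = []
    for start, count in bios:
        end = start + count
        # First bad sector inside this bio, if any.
        bad = min((s for s in bad_sectors if start <= s < end), default=None)
        if bad is None:
            results.append((count, False))        # whole bio succeeded
        else:
            # Partial completion up to the bad sector, then skip
            # straight to the next bio instead of failing the request.
            results.append((bad - start, True))
    return results
```

In this model an error inside a readahead bio leaves the neighbouring bios of the same request unaffected, which is the property that would address both the readahead and the merging cases.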

..

Yeah, that's a reasonable way to tackle it. And you're right, we *did* discuss
this back two years ago. It just never made it as far as new code. :)

Something else that might be good here would be to have the md layer
pass down a (per-bio?) flag indicating whether it has redundancy capability
or not for the I/O. E.g. a healthy RAID1/4/5/10 etc. would set the flag,
and SCSI could then just abort immediately on a bad sector, with NO retries
beyond the first bad one.

On RAID0, or a degraded (no spares) RAID1 etc., it would not set the flag,
so SCSI would try harder to recover the data, as we're discussing above.

This sounds like FAST_FAIL, but is different. And the hint needs to
come from the upper layer that is performing redundancy.
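
The redundancy hint could be sketched like this. It is a toy Python model with hypothetical names (`has_redundancy`, `on_bad_sector`), not an existing md or SCSI interface. Including RAID6 under the "etc." and treating any array with a missing disk as non-redundant are simplifying assumptions of the sketch.

```python
# RAID levels that can rebuild a bad sector from other disks when healthy.
# RAID6 is an assumption here; the text above lists 1/4/5/10 "etc.".
REDUNDANT_LEVELS = {1, 4, 5, 6, 10}

def has_redundancy(level, total_disks, working_disks):
    """Upper layer (md) side: would it set the hint for this I/O?"""
    healthy = working_disks == total_disks
    return level in REDUNDANT_LEVELS and healthy

def on_bad_sector(redundant):
    """Lower layer side: what to do after the first bad sector."""
    if redundant:
        return "abort"           # md can fetch the data elsewhere
    return "try_remaining"       # no other copy, so try harder
```

For example, `on_bad_sector(has_redundancy(1, 2, 2))` gives `"abort"` for a healthy two-disk RAID1, while the same array with one disk missing, or any RAID0, gives `"try_remaining"`.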
