Re: [PATCH] libata error handling fixes (ATAPI)

From: Jeff Garzik
Date: Tue Nov 15 2005 - 04:28:46 EST


Tejun Heo wrote:
I've been trying to do about the same thing and here's my version
which only fixes error/sense handling. This patch makes ahci's
eng_timeout act correctly w.r.t ATAPI sense requesting. libata ATAPI
request sensing mechanism was not broken. What's broken was AHCI's
eng_timeout handler. And, yes, this problem disappears with request
sensing via active qc turn-around in your patch.

Yes, that's intentional. I fixed the problem in AHCI by copying code from libata-core's ata_qc_timeout(), similar to what you did. Didn't like that solution at all, so I started over from scratch.


As I've written several times, I'm not a big fan of sense requesting
by turning around active qc. For libata's EH to work any better,
handing-over failed commands to EH is necessary with or without
request sensing. Hmm... It's true that the current implementation
used for ATAPI request sensing needs improvements. Remember those
patches which implement generic failed command handover and perform
request sensing from EH with proper timeout and synchronization?

We'll see how things shake out in the future.

One of the reasons I haven't responded to your ATA exceptions doc RFC is that I don't like to plan. Linux isn't about roadmaps, it's about what's the next-best-step. I think my current EH fixes are the next best step, as it cleans up the error handler a lot, and gets things working. Next step after that? No idea. I just go on gut feeling I have based on direction.

The current feeling is that we should move away from complex dependencies on the SCSI EH layer, and do all our own error handling (perhaps in ata_wq). This will allow use of libata as a block driver.


Ah.. also, I'm working on sil24 ATAPI support. However, it seems that
I need to do a little bit more; unfortunately, I'm not sure which....
INQUIRY succeeds and then my drive fails the first TUR with SK 6h
(UNIT ATTENTION) which is probably due to previous phy reset. Then,
my drive fails to repsond to following REQUEST SENSE.

With my patched AHCI, REQUEST SENSE works and after SK 06h and several
SK 02h's (NOT READY), it comes online and works okay.

As INQUIRY works okay, my hunch is that I'm not performing TUR
properly (that is ATAPI command with no data) and the drive locks up
after it. I'll fool around with this more and report.

TUR fails for me, too. It triggers the EH code, which is why I wound up needing to fix it :)

Have you dumped the D2H FIS returned by the drive, when the TUR fails? What does it look like? What does the D2H FIS look like after REQUEST SENSE?

I bet there is some "clear error" actions we need to take on sil24, before it work there.

Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/