Re: [bug] ata subsystem related crash with latest -git

From: Jeff Garzik
Date: Thu Oct 18 2007 - 03:30:38 EST


Jens Axboe wrote:
On Thu, Oct 18 2007, Jeff Garzik wrote:
Mark Lord wrote:
Mark Lord wrote:
Linus Torvalds wrote:
On Wed, 17 Oct 2007, Mark Lord wrote:
It would be good to have something soon-ish.
This "dead at boot time" issue is impacting the general ability to test
patches against latest -git in time for the current merge window.
In the meantime, does the patch I sent out help people?
Your patch from this posting http://lkml.org/lkml/2007/10/17/285
does not seem to make much difference here.

It still crashes at exactly the same place.
However, Jens's patch from that same thread:
http://lkml.org/lkml/2007/10/17/269
..allowed me to boot and post this followup message from -git12
Jeff: try that one.
That's already in my upstream kernel, here. commits
ba951841ceb7fa5b06ad48caa5270cc2ae17941e and a3bec5c5aea0da263111c4d8f8eabc1f8560d7bf.

sata_mv and sata_nv still reliably poop themselves here, whereas its rock solid with 2.6.23.1. Sounds like different issues from yours, as I see a stream of SATA errors on the bad kernels, errors which are often a symptom of something whacked in the DMA engine (misprogramming causes the silicon to generate bogus FIS's, which the device then chokes on)

Do you know if this poop involves the segment padding that sometimes
goes on in libata?

Definitely not, in this case -- it's all ATA, nothing ATAPI.

It throws SError { Handshk } which then triggers the EH to reset the link, and so it goes, over and over :) The same thing happens when I intentionally screw up the PRD tables. Not much more data points than that, so far.

I'll try the SCSI_MAX_SG_SEGMENTS patch too, to see if that fixes things.

Jeff



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/