Re: 2.6.29 regression: ATA bus errors on resume

From: Tejun Heo
Date: Sat Apr 04 2009 - 12:46:24 EST


Hello, Niel.

Niel Lambrechts wrote:
>> Thanks for your help, I've done at least 5 hibernate cycles without the
>> problem recurring, I'll keep at it for a while... :)

Hmmm..

>> For the sake of being thorough, I'd like to mention some of the
>> remaining issues/messages, but to be honest some of them were there
>> before and may not be relevant to your efforts:
>>
>> 1) Can you perhaps confirm if the remaining ATA messages are harmless
>> enough to ignore?
>>
>> dmesg:
>> ata2: exception Emask 0x10 SAct 0x0 SErr 0x4050000 action 0x1e frozen
>> ata2: irq_stat 0x00400040, connection status changed
>> ata2: SError: { PHYRdyChg CommWake DevExch }
>> Clocksource tsc unstable (delta = -412838835 ns)

Yeah, that happens on some machines during resume but not on others
(mine included), and I don't know what makes the difference yet. It's a
bit annoying and probably adds a little to resume time, but libata
handles this kind of thing just fine, so there's no need to be alarmed.
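
For reference, the names in the "SError: { ... }" line are just the set
bits of the SErr value spelled out. Below is a minimal sketch of that
decoding, assuming the standard SATA SError bit layout and the short
names libata prints; SERR_BITS and decode_serror are hypothetical
helper names for illustration, not kernel code.

```python
# Bit positions follow the SATA SError register layout; the short names
# mirror the ones libata prints in its "SError: { ... }" line.
# SERR_BITS and decode_serror are illustrative names, not kernel symbols.
SERR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(serr):
    """Return the names of the set SError bits, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if serr & (1 << bit)]


# SErr value taken from the dmesg line quoted above.
print(decode_serror(0x4050000))  # ['PHYRdyChg', 'CommWake', 'DevExch']
```

All three bits describe the link coming back up (PHY ready change,
COMWAKE seen, device exchange detected), which fits a port being
re-probed on resume.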

>> and in messages:
>> Apr 2 12:31:44 linux-7vph kernel: ata1: exception Emask 0x10 SAct 0x0
>> SErr 0x0 action 0x9 t4
>>
>> 2) The screen remains blank on resume, right until I both press a key
>> _and_ touch the touchpad. Weird, but this happens in 2.6.28.9 as well,
>> perhaps this is i915 related.

Heh... yeah, black magic of display suspend/resume. Hopefully it will
get better with KMS.

> UPDATE:
>
> Just arrived back home, before suspending I enabled all the powertop
> laptop mode/SATA link management etc and I removed a CD-ROM from the
> CD drive so I'm not sure what could be responsible for this:
>
> ata2.00: XXX setting retry on qc0
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> ata2.00: irq_stat 0x40000001
> ata2.00: XXX terminating qc0 (SENSE), retries=0
> XXX scsi_eh_flush_done_q: online=1(2) noretry=0 retries=3 allowed=3
> scsi_eh_1: flush finish cmd: f5dea740
> ata2.00: XXX setting retry on qc0
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
> ata2.00: irq_stat 0x40000001
> ata2.00: XXX terminating qc0 (SENSE), retries=0
> XXX scsi_eh_flush_done_q: online=1(2) noretry=0 retries=2 allowed=2
> .. loop ..
>
> I hope it's simply something like "drive not ready" debugging output,
> since ata2 seems associated with the CD-ROM drive... :)
>
> This continued until I enabled SATA link management in powertop again.

That's probably hal polling for media presence. It uses
TEST_UNIT_READY, and if there's no media in the drive, the command
completes with CHECK_CONDITION status. libata EH handles that and is
usually quiet about it since it's not an error, but with the debugging
patch it got more whiny. If it annoys you, just put a disc in the
drive.
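
To illustrate why this is not a real error: an empty drive normally
answers the readiness poll with sense data saying "not ready, medium
not present". The sketch below picks that out of a fixed-format sense
buffer. It assumes the usual SCSI encoding (sense key 0x2 NOT READY,
ASC 0x3A MEDIUM NOT PRESENT); the function names and the sample buffer
are made up for illustration and were not captured from this machine.

```python
# Illustrative only: parse_fixed_sense, medium_not_present and the sample
# buffer are hypothetical, not taken from libata or from the logs above.
NOT_READY = 0x2           # SCSI sense key: NOT READY
MEDIUM_NOT_PRESENT = 0x3A  # additional sense code: MEDIUM NOT PRESENT


def parse_fixed_sense(buf):
    """Return (sense_key, asc, ascq) from fixed-format sense data."""
    # Fixed format uses response code 0x70 (current) or 0x71 (deferred),
    # with the sense key in byte 2 and ASC/ASCQ in bytes 12 and 13.
    if len(buf) < 14 or (buf[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return buf[2] & 0x0F, buf[12], buf[13]


def medium_not_present(buf):
    """True if the sense data says the drive simply has no disc in it."""
    key, asc, _ascq = parse_fixed_sense(buf)
    return key == NOT_READY and asc == MEDIUM_NOT_PRESENT


# Made-up 18-byte sense buffer of the kind an empty drive would return.
sample = bytes([0x70, 0x00, 0x02, 0, 0, 0, 0, 0x0A,
                0, 0, 0, 0, 0x3A, 0x00, 0, 0, 0, 0])
print(medium_not_present(sample))  # True
```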

Thanks.

--
tejun
