Re: [smartmontools-support] inactive SATA drives won't stay in standbyor sleep, PATA models did. (fwd)

From: Linda Walsh
Date: Mon Oct 06 2008 - 21:36:26 EST


Controller is a Promise TX4/300
Is this what you were looking for?:

Oct 6 16:59:14 ish kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Oct 6 16:59:14 ish kernel: ata2.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0
Oct 6 16:59:14 ish kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 6 16:59:14 ish kernel: ata2.00: status: { DRDY }
Oct 6 16:59:20 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 16:59:24 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 16:59:30 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 16:59:34 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 16:59:40 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:00:09 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:00:09 ish kernel: ata2: limiting SATA link speed to 1.5 Gbps
Oct 6 17:00:14 ish dhcpd: Forward map from ns1.sc.tlinx.org to 192.168.3.242 FAILED: Has an A record but no DHCID, not mine.
Oct 6 17:00:15 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:00:15 ish kernel: ata2: reset failed, giving up
Oct 6 17:00:15 ish kernel: ata2.00: disabled
Oct 6 17:00:15 ish kernel: ata2: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t4
Oct 6 17:00:15 ish kernel: ata2: hotplug_status 0x22
Oct 6 17:00:20 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:00:25 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:00:30 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:00:35 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:00:40 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:01:10 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:01:10 ish kernel: ata2: limiting SATA link speed to 1.5 Gbps
Oct 6 17:01:15 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:01:15 ish kernel: ata2: reset failed, giving up
Oct 6 17:01:15 ish kernel: ata2: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t3
Oct 6 17:01:15 ish kernel: ata2: hotplug_status 0x22
Oct 6 17:01:20 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:01:25 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:01:30 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:01:35 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:01:40 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:02:10 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:02:10 ish kernel: ata2: limiting SATA link speed to 1.5 Gbps
Oct 6 17:02:15 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:02:15 ish kernel: ata2: reset failed, giving up
Oct 6 17:02:15 ish kernel: ata2: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t2
Oct 6 17:02:15 ish kernel: ata2: hotplug_status 0x22
Oct 6 17:02:21 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:02:25 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:02:31 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:02:35 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:02:41 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:03:01 ish sshd[4020]: error: channel 0: chan_read_failed for istate 3
Oct 6 17:03:10 ish syslog-ng[13177]: last message repeated 2 times
Oct 6 17:03:10 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:03:10 ish kernel: ata2: limiting SATA link speed to 1.5 Gbps
Oct 6 17:03:15 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:03:15 ish kernel: ata2: reset failed, giving up
Oct 6 17:03:15 ish kernel: ata2: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t1
Oct 6 17:03:15 ish kernel: ata2: hotplug_status 0x22
Oct 6 17:03:21 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:03:25 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:03:31 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:03:35 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:03:41 ish kernel: ata2: link is slow to respond, please be patient (ready=-19)
Oct 6 17:04:10 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:04:10 ish kernel: ata2: limiting SATA link speed to 1.5 Gbps
Oct 6 17:04:15 ish kernel: ata2: COMRESET failed (errno=-16)
Oct 6 17:04:15 ish kernel: ata2: reset failed, giving up
Oct 6 17:04:15 ish kernel: ata2: EH pending after 5 tries, giving up
Oct 6 17:04:15 ish kernel: sd 2:0:0:0: rejecting I/O to offline device
Oct 6 17:04:15 ish kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
Oct 6 17:04:15 ish kernel: sd 2:0:0:0: [sdc] START_STOP FAILED
Oct 6 17:04:33 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:04:33 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:05:45 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:05:45 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:06:31 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:07:30 ish syslog-ng[13177]: last message repeated 2 times
Oct 6 17:07:33 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:07:33 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:08:32 ish kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Oct 6 17:08:32 ish kernel: ata1.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0
Oct 6 17:08:32 ish kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 6 17:08:32 ish kernel: ata1.00: status: { DRDY }
Oct 6 17:08:38 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:08:42 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:08:45 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:08:48 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:08:52 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:08:58 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:09:21 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:09:27 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:09:27 ish kernel: ata1: limiting SATA link speed to 1.5 Gbps
Oct 6 17:09:32 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:09:32 ish kernel: ata1: reset failed, giving up
Oct 6 17:09:32 ish kernel: ata1.00: disabled
Oct 6 17:09:32 ish kernel: ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t4
Oct 6 17:09:32 ish kernel: ata1: hotplug_status 0x88
Oct 6 17:09:38 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:09:42 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:09:48 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:09:52 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:09:57 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:09:58 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:10:27 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:10:27 ish kernel: ata1: limiting SATA link speed to 1.5 Gbps
Oct 6 17:10:32 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:10:32 ish kernel: ata1: reset failed, giving up
Oct 6 17:10:32 ish kernel: ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t3
Oct 6 17:10:32 ish kernel: ata1: hotplug_status 0x88
Oct 6 17:10:33 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:10:38 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:10:42 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:10:48 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:10:52 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:10:58 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:11:09 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:11:27 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:11:27 ish kernel: ata1: limiting SATA link speed to 1.5 Gbps
Oct 6 17:11:33 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:11:33 ish kernel: ata1: reset failed, giving up
Oct 6 17:11:33 ish kernel: ata1: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen t2
Oct 6 17:11:33 ish kernel: ata1: hotplug_status 0x88
Oct 6 17:11:38 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:11:43 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:11:45 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Oct 6 17:11:48 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:11:53 ish kernel: ata1: COMRESET failed (errno=-16)
Oct 6 17:11:58 ish kernel: ata1: link is slow to respond, please be patient (ready=-19)
Oct 6 17:12:17 ish xinetd[2021]: Exiting...
Oct 6 17:12:17 ish kernel: nfsd: last server has exited
Oct 6 17:12:17 ish kernel: nfsd: unexporting all filesystems
Oct 6 17:12:17 ish apcupsd[1989]: apcupsd exiting, signal 15
Oct 6 17:12:17 ish apcupsd[1989]: apcupsd shutdown succeeded
Oct 6 17:12:17 ish rpc.statd[2074]: Caught signal 15, un-registering and exiting.
Oct 6 17:12:17 ish mountd[2075]: Caught signal 15, un-registering and exiting.
Oct 6 17:12:21 ish kernel: Filesystem "sdc1": xfs_log_force: error 5 returned.
Tejun Heo wrote:
Linda Walsh wrote:
So the real problem is why issuing a smart command isn't re-starting
the drive -- or bringing it back from standby. Whereas a "normal" disk
read seems to bring it back to normal functioning just fine (and can
then do the smart-test).

Does this give anyone ideas about where the problem might be? Also
sorta explains why my hangs have been infrequent, because I've been
periodically polling the temps of all the drives -- and only when I stop
the polling would the drive timeout, then die the next morning when
smartd tried to run a short test between 1 and 2 am.

Sounds like a firmware problem to me. Issuing ATA_CMD_VERIFY on block
0 before issuing test commands should work around the problem. Also,
which controller are you using? Can you post the failing kernel log?


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/