Re: Lots of con-current I/O = resets SATA link? (2.6.25.10)

From: Gerhard Wiesinger
Date: Mon Jul 07 2008 - 11:21:33 EST


Hello!

Missing logs attached ...

Ciao,
Gerhard

--
http://www.wiesinger.com/


On Mon, 7 Jul 2008, Gerhard Wiesinger wrote:

Hello!

I'm having a similar problem with brand-new hardware running Fedora 9 x86_64:
8GB RAM
Motherboard: ASUS M3N-H/HDMI
Chipset: NForce 8300/Nvidia 750a
CPU: AMD AM2 5600+, 2.9GHz, Brisbane Dual Core
Kernel: 2.6.25.9-76.fc9.x86_64
Smartmontools: smartmontools-5.38-2.fc9.x86_64
BIOS AHCI mode
The drives on ata3 and ata4 are powered from the same cable of an Enermax power supply.

ata1.00: ATA-7: SAMSUNG HD103UJ, 1AA01109, max UDMA7
ata2.00: ATA-7: SAMSUNG HD103UJ, 1AA01109, max UDMA7
ata3.00: ATA-7: SAMSUNG HD103UJ, 1AA01109, max UDMA7
ata4.00: ATA-7: SAMSUNG HD103UJ, 1AA01109, max UDMA7
ata5.00: ATA-7: SAMSUNG HD103UJ, 1AA01109, max UDMA7
ata6.00: ATA-7: SAMSUNG HD103UJ, 1AA01109, max UDMA7
scsi 0:0:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5
scsi 1:0:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5
scsi 2:0:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5
scsi 3:0:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5
scsi 4:0:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5
scsi 5:0:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5

The problem occurs only on ata3. I've now changed the disk on port 3 for the third time (new disks each time) and changed the SATA cable, too. The problem still exists.

Sometimes a RAID rebuild doesn't work at all.

To bring the drive back to life I have to power down the system.

Logs are attached.

Could it be a bug triggered by concurrent access from smartctl/smartd?

Any ideas?

Ciao,
Gerhard

--
http://www.wiesinger.com/

Jul 7 15:56:22 big8 kernel: md: data-check of RAID array md0
Jul 7 15:56:22 big8 kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Jul 7 15:56:22 big8 kernel: md: using maximum available idle IO bandwidth (but not more than 80000 KB/sec) for data-check.
Jul 7 15:56:22 big8 kernel: md: using 128k window, over a total of 976759808 blocks.
Jul 7 15:56:26 big8 kernel: md: data-check of RAID array md1
Jul 7 15:56:26 big8 kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Jul 7 15:56:26 big8 kernel: md: using maximum available idle IO bandwidth (but not more than 80000 KB/sec) for data-check.
Jul 7 15:56:26 big8 kernel: md: using 128k window, over a total of 976759808 blocks.
Jul 7 16:00:53 big8 kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jul 7 16:00:53 big8 kernel: ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Jul 7 16:00:53 big8 kernel: res 40/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
Jul 7 16:00:53 big8 kernel: ata3.00: status: { DRDY }
Jul 7 16:00:53 big8 kernel: ata3: hard resetting link
Jul 7 16:01:00 big8 kernel: ata3: port is slow to respond, please be patient (Status 0x80)
Jul 7 16:01:03 big8 kernel: ata3: softreset failed (device not ready)
Jul 7 16:01:03 big8 kernel: ata3: hard resetting link
Jul 7 16:01:10 big8 kernel: ata3: port is slow to respond, please be patient (Status 0x80)
Jul 7 16:01:13 big8 kernel: ata3: softreset failed (device not ready)
Jul 7 16:01:13 big8 kernel: ata3: hard resetting link
Jul 7 16:01:20 big8 kernel: ata3: port is slow to respond, please be patient (Status 0x80)
Jul 7 16:01:48 big8 kernel: ata3: softreset failed (device not ready)
Jul 7 16:01:48 big8 kernel: ata3: limiting SATA link speed to 1.5 Gbps
Jul 7 16:01:48 big8 kernel: ata3: hard resetting link
Jul 7 16:01:53 big8 kernel: ata3: softreset failed (device not ready)
Jul 7 16:01:53 big8 kernel: ata3: reset failed, giving up
Jul 7 16:01:53 big8 kernel: ata3.00: disabled
Jul 7 16:01:53 big8 kernel: ata3: EH complete
Jul 7 16:01:53 big8 kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 7 16:01:53 big8 kernel: end_request: I/O error, dev sdc, sector 29123647
Jul 7 16:01:53 big8 kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 7 16:01:53 big8 kernel: end_request: I/O error, dev sdc, sector 29124927
Jul 7 16:01:53 big8 kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 7 16:01:53 big8 kernel: end_request: I/O error, dev sdc, sector 29123903
Jul 7 16:01:53 big8 kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 7 16:01:53 big8 kernel: end_request: I/O error, dev sdc, sector 1182894207
Jul 7 16:01:53 big8 kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 7 16:01:53 big8 kernel: end_request: I/O error, dev sdc, sector 1139671639
Jul 7 16:01:53 big8 kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 7 16:01:53 big8 kernel: end_request: I/O error, dev sdc, sector 1557222687
Jul 7 16:01:53 big8 kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 7 16:01:53 big8 kernel: end_request: I/O error, dev sdc, sector 29123647
Jul 7 16:01:53 big8 kernel: raid5: Disk failure on sdc1, disabling device. Operation continuing on 2 devices
Jul 7 16:01:53 big8 kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Jul 7 16:01:53 big8 kernel: end_request: I/O error, dev sdc, sector 29124671
Jul 7 16:01:53 big8 kernel: md: md0: data-check done.
Jul 7 16:01:53 big8 kernel: RAID5 conf printout:
Jul 7 16:01:53 big8 kernel: --- rd:3 wd:2
Jul 7 16:01:53 big8 kernel: disk 0, o:1, dev:sda1
Jul 7 16:01:53 big8 kernel: disk 1, o:1, dev:sdb1
Jul 7 16:01:53 big8 kernel: disk 2, o:0, dev:sdc1
Jul 7 16:01:53 big8 kernel: RAID5 conf printout:
Jul 7 16:01:53 big8 kernel: --- rd:3 wd:2
Jul 7 16:01:53 big8 kernel: disk 0, o:1, dev:sda1
Jul 7 16:01:53 big8 kernel: disk 1, o:1, dev:sdb1
Jul 7 16:02:45 big8 kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
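
To make the ata3 sequence in the log above easier to follow, here is a minimal Python sketch that summarizes it: which ATA command timed out, how many hard resets were attempted, when the port was disabled, and when smartctl shows up in the kernel log. The function names, the small opcode table, and the SAMPLE excerpt are my own illustration and not part of libata or smartmontools. The year 2008 is assumed from the mail date because syslog stamps carry none.

```python
import re
from datetime import datetime

# Deliberately incomplete opcode table (values from the ATA command set).
ATA_OPCODES = {
    0xE7: "FLUSH CACHE",
    0xEA: "FLUSH CACHE EXT",
    0xB0: "SMART",
    0xEC: "IDENTIFY DEVICE",
}

# "Jul 7 16:00:53 big8 kernel: <message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+(.*)$")
# "ata3.00: cmd ea/00:..." -> port name and command opcode (first taskfile byte)
CMD_RE = re.compile(r"^(ata\d+)\.\d+: cmd ([0-9a-f]{2})/")


def parse_time(stamp, year=2008):
    """Parse a syslog stamp; the year is an assumption, syslog omits it."""
    stamp = " ".join(stamp.split())  # collapse padding such as "Jul  7"
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def summarize(lines, port="ata3"):
    """Collect the error-handling timeline for one port from syslog lines."""
    summary = {"command": None, "first_error": None, "disabled": None,
               "hard_resets": 0, "smartctl": []}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        when, msg = parse_time(m.group(1)), m.group(2)
        cmd = CMD_RE.match(msg)
        if cmd and cmd.group(1) == port and summary["command"] is None:
            opcode = int(cmd.group(2), 16)
            summary["command"] = ATA_OPCODES.get(opcode, hex(opcode))
        elif (msg.startswith(f"{port}.") and "exception" in msg
              and summary["first_error"] is None):
            summary["first_error"] = when
        elif msg == f"{port}: hard resetting link":
            summary["hard_resets"] += 1
        elif msg.startswith(f"{port}.") and msg.endswith(": disabled"):
            summary["disabled"] = when
        elif "smartctl" in msg:
            summary["smartctl"].append(when)
    return summary


# Excerpt of the log above (only the lines the summary looks at).
SAMPLE = [
    "Jul 7 16:00:53 big8 kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen",
    "Jul 7 16:00:53 big8 kernel: ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0",
    "Jul 7 16:00:53 big8 kernel: ata3: hard resetting link",
    "Jul 7 16:01:03 big8 kernel: ata3: hard resetting link",
    "Jul 7 16:01:13 big8 kernel: ata3: hard resetting link",
    "Jul 7 16:01:48 big8 kernel: ata3: hard resetting link",
    "Jul 7 16:01:53 big8 kernel: ata3.00: disabled",
    "Jul 7 16:02:45 big8 kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO",
]

if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value}")
```

To run it against a full log instead of the excerpt, pass the lines of the saved file, for example summarize(open("messages")), where "messages" is a placeholder file name. Note that the only smartctl line in this log comes after the port was already disabled, so this excerpt alone neither confirms nor rules out the smartctl/smartd theory. A longer log with smartd's polling times would be needed for that.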