Re: exception Emask 0x0 SAct 0x1 / SErr 0x0 action 0x2 frozen

From: Justin Piszcz
Date: Fri Oct 10 2008 - 15:13:41 EST




On Sat, 4 Oct 2008, Justin Piszcz wrote:



On Sat, 4 Oct 2008, Tejun Heo wrote:

Justin Piszcz wrote:


What do these signifiers mean? They are always the same, no matter the
controller used or the disk in question (this happens across 12 disks and 3 different controllers):

[420781.333179] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[420781.333189] ata6.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0
^^ ^^(b0/d8)^^ ^^(4f:c2)
[420781.333190] res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
^^ 40:00:ff
[420781.333194] ata6.00: status: { DRDY }
[420781.333200] ata6: hard resetting link
[420781.638589] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[420781.662166] ata6.00: configured for UDMA/133
[420781.662166] ata6: EH complete

(At the time there was little to no I/O occurring on this block device, but disks on the raid5 volume were being accessed, so there was system activity, mainly disk reads at 300-500KiB/s over ethernet.)

Nick's(?) problem:

Nick

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
^^ ^^ (ea/00) vs. (b0/d8) - mine are always the same (FYI)
res 40/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
^^ ^^ (40:00: but no ff)

The rest of the messages are the same. Is there any correlation
that can be made here? When this happens to others, is it
always the same codes as shown above or do they change? If they
do not change, how come they vary between users who have this
problem?

ata1.00: status: { DRDY }
ata1: soft resetting link
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: configured for UDMA/133
ata1: EH complete
sd 1:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB)
sd 1:0:0:0: [sda] Write Protect is off
sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA

Can anything be said about these errors, can we classify them into groups?
Or are they just random? It does not appear to happen more or less with one filesystem or another either: one guy is using ext3, I am using XFS, so it is certainly something much deeper.
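
To compare reports from different people, the cmd/res lines could be split into named fields and grouped by a few of them. Below is a rough sketch in Python, not a definitive decoder. It assumes the bytes are printed in the order command/feature:nsect:lbal:lbam:lbah, then the five hob_* bytes, then the device byte, and that on a "res" line the first two bytes are the status and error registers. The names parse_taskfile and signature are made up for illustration.

```python
import re

# Assumed field order of the "cmd"/"res" dump; on "res" lines the first two
# bytes are taken to be the status and error registers.
FIELDS = ("command", "feature", "nsect", "lbal", "lbam", "lbah",
          "hob_feature", "hob_nsect", "hob_lbal", "hob_lbam", "hob_lbah",
          "device")

# Twelve two-digit hex bytes separated by "/" or ":" after "cmd " or "res ".
TF_RE = re.compile(r"\b(cmd|res) ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def parse_taskfile(line):
    """Return (kind, {field: int}) for a cmd/res line, or None if no match."""
    m = TF_RE.search(line)
    if m is None:
        return None
    values = [int(b, 16) for b in re.split(r"[/:]", m.group(2))]
    return m.group(1), dict(zip(FIELDS, values))


def signature(cmd_line, res_line):
    """Hypothetical grouping key: (command, feature, status, error)."""
    _, cmd = parse_taskfile(cmd_line)
    _, res = parse_taskfile(res_line)
    return (cmd["command"], cmd["feature"], res["command"], res["feature"])


if __name__ == "__main__":
    mine = signature(
        "ata6.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0",
        "res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)")
    nick = signature(
        "ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0",
        "res 40/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)")
    print([hex(v) for v in mine])
    print([hex(v) for v in nick])
```

Grouping on such a key would at least show whether the timeouts cluster around particular commands (b0/d8 in my case, ea/00 in the other report) or are spread across many.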

Justin.