Re: Did my SATA drive fail? What should I do?

From: Nathan
Date: Sun Jan 25 2009 - 22:45:29 EST


On Sat, Jan 24, 2009 at 1:34 AM, Robert Hancock <hancockr@xxxxxxx> wrote:
> Nathan wrote:
>>
>> First off: Hi! I'm new to this list. Googling some of the errors in
>> my logs brought up several posts to this list, none of which led to
>> any obvious resolutions, so I thought I'd try joining the list...
>>
>> I have a linux server[1] that has been acting as an asterisk pbx for
>> about three months without a problem. Today, the SATA hard drive
>> suddenly went into read-only mode (!?). Reading seems to work just
>> fine, as I've ssh'd in and looked at a bunch of files, including
>> /var/log/messages [2].
>>
>> Is this just my hardware failing? Could this a kernel bug?
>> Recommendations? I would just reboot the thing, but asterisk is still
>> successfully handling dozens of calls right now, despite not being
>> able to write to the disk...
>>
>> [1] # uname -a
>> Linux phoneserver3 2.6.25-gentoo-r7 #2 SMP Thu Oct 16 20:59:53 MDT
>> 2008 x86_64 Intel(R) Xeon(R) CPU E5410 @ 2.33GHz GenuineIntel
>> GNU/Linux
>>
>> [2] (last several lines from /var/log/messages)
>>
>> Jan 22 14:43:09 phoneserver3 dhcpd: DHCPDISCOVER from
>> 00:04:f2:1e:7e:21 via eth2: network 10.254.254/24: no free leases
>> Jan 22 14:43:09 phoneserver3 dhcpd: DHCPDISCOVER from
>> 00:04:f2:1e:7c:6b via eth2: network 10.254.254/24: no free leases
>> Jan 22 14:43:35 phoneserver3 dhcpd: DHCPDISCOVER from
>> 00:04:f2:1e:0d:47 via eth2: network 10.254.254/24: no free leases
>> Jan 22 14:43:36 phoneserver3 dhcpd: DHCPDISCOVER from
>> 00:1f:28:83:00:80 via eth2: network 10.254.254/24: no free leases
>> Jan 22 14:43:41 phoneserver3 ata1.00: exception Emask 0x0 SAct 0x0
>> SErr 0x0 action 0x0
>> Jan 22 14:43:41 phoneserver3 ata1.00: BMDMA stat 0x25
>> Jan 22 14:43:41 phoneserver3 ata1.00: cmd
>> ca/00:20:00:7c:ef/00:00:00:00:00/ef tag 0 dma 16384 out
>> Jan 22 14:43:41 phoneserver3 res 51/10:20:00:7c:ef/00:00:00:00:00/ef
>> Emask 0x81 (invalid argument)
>> Jan 22 14:43:41 phoneserver3 ata1.00: status: { DRDY ERR }
>> Jan 22 14:43:41 phoneserver3 ata1.00: error: { IDNF }
>
> Most likely this is the disk failing. It's reporting the sector wasn't found
> during a write operation.
>

Thank you for your response! I will replace the disk just to be safe.

~ Nathan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/