Re: 2.6.29 regression: ATA bus errors on resume

From: Berthold Gunreben
Date: Wed Sep 30 2009 - 05:58:55 EST


Hi Tejun,

thanks a lot for your reply.

On Friday 25 September 2009, Tejun Heo wrote:
> Hello, Berthold.
>
> The disk is most likely losing power briefly. After boot, run
> "smartctl -a" on the device and record the output. After triggering
> the problem, do it again. See if Start_Stop_Count, Power_Cycle_Count
> or Power-Off_Retract_Count has increased. If so, take out your PSU,
> bury it half-deep in your backyard, apply some gasoline, light it up
> and enjoy the sight of perishing evil with a can of beer.

You might be right. However, I cannot reproduce the problem anymore since I
switched the filesystem to the totally unsupported JFS.
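
For reference, the before/after comparison you describe could be scripted
roughly as below. This is only a sketch: the helper names (parse_counters,
increased, read_counters) are made up for illustration, and it assumes the
usual ten-column attribute table that smartctl prints for ATA disks, with the
raw value starting in the tenth column.

```python
import subprocess

# The three attributes named in the quoted advice.
WATCHED = ("Start_Stop_Count", "Power_Cycle_Count", "Power-Off_Retract_Count")


def parse_counters(smartctl_output):
    """Extract the raw values of the watched attributes from smartctl text.

    Assumes attribute rows of the form:
    ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    counters = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] in WATCHED:
            try:
                counters[fields[1]] = int(fields[9])
            except ValueError:
                pass  # raw value not a plain integer; skip this row
    return counters


def increased(before, after):
    """Return {name: (old, new)} for every counter that went up."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and after[name] > before[name]
    }


def read_counters(device):
    """Run 'smartctl -a' on a device (needs root) and parse the result."""
    # No check=True: smartctl uses its exit status as a bit mask, so a
    # non-zero status does not necessarily mean that no output was produced.
    result = subprocess.run(
        ["smartctl", "-a", device], capture_output=True, text=True
    )
    return parse_counters(result.stdout)
```

The idea would be to call read_counters("/dev/sdX") once after boot and once
after the errors appear, and pass both results to increased(). A non-empty
result would point towards the disk briefly losing power.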

In the meantime, I was able to copy 1.5TB of data back to the array, and the
system also survived artificially generated high load. If the problem is a
race (I do not know whether it is), it might still be there. In any case, it
no longer shows up as often as it did.

It could still be the power supply, of course, but I don't understand why a new
kernel would trigger power outages so often (with current kernels the problem
showed up within 5 minutes at the latest). Maybe it has something to do with
the chipset (ICH7R), which supports hot removal and hot adding of disks. Or it
is related to the hotswap hard disk slots in the case
(http://www.chenbro.eu/corporatesite/products_detail.php?sku=79). I have no
idea.

Thanks

Berthold
