Re: Fwd: libata-pmp patch for 3.2.x and later for eSATA PortMultiplier Sil3726

From: Gwendal Grignou
Date: Sun Mar 25 2012 - 11:28:52 EST


I reread your logs.

Assuming you don't mind the long boot from cold power, the remaining
problem is with the 4 disk enclosures [ata7-ata10] on the second
machine, where the first disk is not found and a warm reboot takes a
very long time.
I am trying to understand why it works with the other 4 enclosures
[ata5-ata6] on the first and second machines.

Also, just to be sure I understand your configuration correctly, your
second machine has 30 disks total, not 40:
2 direct on ata1.00 and ata1.01
8 on 2 enclosures [ 2 * 4] on ata5 and ata6
20 on 4 enclosures [ 4 * 5] on ata7 - ata10
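
As a quick sanity check of the tally above, here is a minimal Python
sketch. The dictionary keys are only descriptive labels chosen for this
example; the counts are the ones listed above.

```python
# Disk layout of the second machine as listed above (label -> disk count).
# The labels are illustrative; only the counts come from the mail.
layout = {
    "direct (ata1.00, ata1.01)": 2,
    "ata5-ata6 (2 enclosures x 4 disks)": 2 * 4,
    "ata7-ata10 (4 enclosures x 5 disks)": 4 * 5,
}

total = sum(layout.values())
print(total)  # 30
```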

Also, from the log, ata5 and ata6 are behind a Sil3132 based
controller, while ata7-ata10 are behind a single Sil3124, not the
opposite as you said in a previous mail.

If possible, could you switch 2 of the 4 enclosures [with their disks]
that fail to the ports controlled by the Sil3132 controller, reboot
the machine with all its 30 drives and see if the failures follow the
controller or the enclosure?
If your raid configuration is based on signatures, that should be
fine, but if it is based on kernel device names [sdX], the swap will
confuse md and will mess with your data.
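
Before moving the enclosures, one way to check which case applies is
to look at the ARRAY lines of mdadm.conf. The Python sketch below is
only an illustration under some assumptions: the helper name
arrays_pinned_to_device_names is made up for this example, the sample
config text is invented, and the parser is simplified (it ignores
continuation lines and abbreviated keywords). It flags arrays whose
definition uses devices= with sdX names; arrays identified by UUID are
not flagged.

```python
import re


def arrays_pinned_to_device_names(conf_text):
    """Return the names of ARRAY entries that list devices= with /dev/sdX names.

    Hypothetical helper for illustration; simplified mdadm.conf parsing.
    """
    risky = []
    for line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line.upper().startswith("ARRAY"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name = fields[1]
        for field in fields[2:]:
            key, _, value = field.partition("=")
            if key.lower() == "devices" and re.search(r"/dev/sd[a-z]+", value):
                risky.append(name)
                break
    return risky


# Invented sample config: md0 is identified by UUID, md1 by device names.
sample = """
ARRAY /dev/md0 UUID=3aaa0122:29827cfa:5331ad66:ca767371
ARRAY /dev/md1 devices=/dev/sdb1,/dev/sdc1
"""

print(arrays_pinned_to_device_names(sample))  # ['/dev/md1']
```

On a real system the text would come from the machine's own
mdadm.conf (its path varies by distribution). An empty result only
says that no ARRAY line names sdX devices; other places such as fstab
may still refer to them.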

I am sorry I don't have any other suggestions right now.

Regards,
Gwendal.

On Sat, Mar 24, 2012 at 6:19 PM, ANEZAKI, Akira
<fireblade1230@xxxxxxxxxxx> wrote:
> Hello Gwendal,
>
> I want to confirm one thing.
> Does the kernel 3.1.x driver still work?
>
> It seems it will take a long time to solve the problem. Of course I
> understand staggered spin-up is the better solution. But I can't wait
> that long. And it affects only the SiI3726.
>
> Best Regards,
> Akira
>
> (2012/03/23 18:59), ANEZAKI, Akira wrote:
>> Hello Gwendal,
>>
>> (2012/03/23 17:31), Gwendal Grignou wrote:
>>>>>>> I notice however some messages I did not see before:
>>>>>>>>> [    4.856382] ata7.15: Port Multiplier 1.1, 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9
>>>>>>>>> [    4.858742] ata7.00: hard resetting link
>>>>>>>>> [   14.843039] ata7.00: softreset failed (timeout)
>>>>>>>>> [   17.836402] ata7.15: qc timeout (cmd 0xe4)
>>>>>>> The latter indicates that the PMP is stuck and the host cannot read
>>>>>>> its internal registers.
>>>>>>> Is it possible that the PMP in these 4 enclosures you are using have a
>>>>>>> different firmware than the other ones?
>>>>>>> Firmware 1.0114 is available at:
>>>>>>> http://www.siliconimage.com/support/searchresults.aspx?pid=26&cat=23
>>>>>>>
>>>>>>> From the release notes:
>>>>>>> """- Fix SRST and initial two RegFIS Problem."""
>>>>
>>>> I'm still fixing broken RAID. Sorry for my slow response.
>>
>> I checked the firmware versions. All of them use version 1.0114.
>>
>> Best Regards,
>> Akira
>