Re: MVSAS 1669:mvs_abort_task:rc= 5

From: Thomas Fjellstrom
Date: Wed Oct 14 2009 - 03:19:58 EST


On Tue October 13 2009, andy yan wrote:
> I will send you a patch for debugging this issue, please help to try and
> send back the log, thanks!

I will do whatever I can to help get this resolved :) I have some C skills,
but no kernel/device driver experience, so at the very least I should be able
to do builds and make small changes if needed, in addition to patching and
endless reboots ;D
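
For sending back the log, here is a minimal sketch (not part of the driver or
of any patch) of how the abort messages could be tallied from saved dmesg
output. It assumes the lines look like the subject of this thread, i.e. they
contain "mvs_abort_task:rc= <n>"; the function name count_abort_codes and the
script name in the usage comment are made up for illustration.

```python
import re
from collections import Counter

# Assumed message shape, taken from the subject line: "...mvs_abort_task:rc= 5".
# The real driver output may differ; adjust the pattern if it does.
ABORT_RE = re.compile(r"mvs_abort_task:rc=\s*(-?\d+)")


def count_abort_codes(lines):
    """Return a Counter mapping each rc value to the number of times it was logged."""
    counts = Counter()
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical usage, feeding it the kernel log on stdin:
#   dmesg | python3 -c "import sys, count_aborts as c; print(c.count_abort_codes(sys.stdin))"
```

A per-rc tally like this might make it easier to compare logs between kernel
versions than reading the raw stream of messages.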

> On Wed, Oct 14, 2009 at 9:39 AM, Thomas Fjellstrom <tfjellstrom@xxxxxxx> wrote:
> > On Sun October 11 2009, Thomas Fjellstrom wrote:
> > > On Sun October 11 2009, Christian Vilhelm wrote:
> > > > Thomas Fjellstrom wrote:
> > > > > Hi,
> > > > >
> > > > > I've been trying to get an AOC-SASLP-MV8 card (pcie x4 2 port SAS
> > > > > card) to work with linux for the past month or so. I've recently
> > > > > just RMAed my first card, and tested the new one under linux, and
> > > > > I see the same problems.
> > > > >
> > > > > The very first time I made a new array off the controller,
> > > > > formatted (with xfs) and mounted the volume, it seemed to work.
> > > > > iozone even seemed to run for a while. Sadly after a few minutes I
> > > > > got a stream of mvs_abort_task messages in dmesg, and any accesses
> > > > > to the volume, or any disks connected to the controller lock up.
> > > > >
> > > > > After that I updated my 2.6.31 kernel to 2.6.32-rc3-git2 off of
> > > > > kernel.org, and the volume fails to mount with the same
> > > > > mvs_abort_task messages.
> > > >
> > > > I have the exact same problem with another Marvell 88SE64xx based
> > > > card, namely an Areca ARC-1300ix-16 and the mvsas driver.
> > > > If the disks are just used alone, with a filesystem on them, all
> > > > seems to work fine. dd and badblocks run fine on them. Mounting them,
> > > > reading/writing work fine. The error seems to pop up, but rarely, when
> > > > several disks are used simultaneously.
> > > > But, an absolute sure way to trigger the error is to assemble (or
> > > > create) a md raid array with the disks. I have attached a syslog
> > > > extract of the error. You can see it happens seconds after the array
> > > > creation.
> > > > I tried :
> > > > 1) disabling the write cache on the disks => same error
> > > > 2) disabling NCQ : in mv_sas.h :
> > > > #define MV_DISABLE_NCQ 1
> > > > same error.
> > > > After a while, the devices handled by the card are just dropped from
> > > > the system and the card stops working at all, a reboot is necessary.
> > >
> > > I have found that a proper reboot is impossible once the card/driver
> > > starts misbehaving. Anything that tries to do anything with the md
> > > device, or any of the component drives will hang. Even kernel threads
> > > it seems. A reboot or a shutdown hangs when it tries to sync the md
> > > device, and ALT+SYSRQ+S/U both hang. After the first Alt+sysrq+s it
> > > will register more of them, but it won't print the "Emergency Sync
> > > Complete" message.
> > >
> > > > Does anyone have a working config based on a Marvell 64xx card ?
> > > >
> > > > I'm willing to explore solutions, patches or anything, just tell me
> > > > what to do to help.
> > > >
> > > > Christian Vilhelm.
> >
> > I'd really appreciate some assistance with this. The card is essentially
> > useless under linux, if not harmful (causes oopses and hangs) with the
> > current driver.
> >
> > My last weekly backup failed while creating the disk image due to my
> > array being low on space, I really need to get the new array up asap.
> >
> > Thanks.
> >
> > --
> > Thomas Fjellstrom
> > tfjellstrom@xxxxxxx
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>


--
Thomas Fjellstrom
tfjellstrom@xxxxxxx