Re: SCSI or libata problem with an RDX removable disk

From: Pascal GREGIS
Date: Wed Sep 10 2008 - 04:42:58 EST

Next message: Ian Campbell: "Re: NFS regression? Odd delays and lockups accessing an NFS export."
Previous message: Christian Borntraeger: "Re: warn_on regression after 2.6.27-rc5"
In reply to: Mark Lord: "Re: SCSI or libata problem with an RDX removable disk"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Mark and Alan,

Thank you for your answers and for the patch Mark sent in his mail.
Shortly before that, I had found this patch:
http://kerneltrap.org/mailarchive/linux-kernel/2007/9/27/324334
which seems to differ from the one you sent.

What causes this difference? Do the two patches apply to different kernel versions, or are they different revisions of the same patch, with one fixing the other?

Thank you

Pascal

Mark Lord wrote, on Mon 08 Sep 2008 at 02:58:17:
> Alan Cox wrote:
> >>Sep 4 08:03:08 devsni1 kernel: ata4: port is slow to respond, please be patient (Status 0xd0)
> >>Sep 4 08:03:31 devsni1 kernel: ata4: port failed to respond (30 secs, Status 0xd0)
> >>Sep 4 08:03:31 devsni1 kernel: ata4: soft resetting port
> >>Sep 4 08:03:32 devsni1 kernel: ATA: abnormal status 0xD0 on port 0x0001d807
> >>Sep 4 08:03:32 devsni1 last message repeated 4 times
> >
> >Your disk went offline and then refused to come back when the link was
> >reset. The initial trigger appears to have been the drive, the fact it
> >didn't come back could either be the drive or a controller problem. We've
> >seen a few cases where devices or controllers fail to recover from one
> >end being stuck expecting data.
> >
> >Mark Lord did some patches to try and drain data in this case but I don't
> >remember if they were merged yet.
> ..
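
For anyone reading the log above, the status byte 0xd0 can be decoded with the standard ATA status register bit values. The following is only a small user-space sketch, not kernel code; the helper name ata_status_str and the table are hypothetical and exist only for illustration. Note that when BSY is set, the other bits are generally not considered meaningful.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Standard ATA status register bits, highest bit first. */
struct status_bit {
    uint8_t mask;
    const char *name;
};

static const struct status_bit status_bits[] = {
    { 0x80, "BSY" },  { 0x40, "DRDY" }, { 0x20, "DF" },  { 0x10, "DSC" },
    { 0x08, "DRQ" },  { 0x04, "CORR" }, { 0x02, "IDX" }, { 0x01, "ERR" },
};

/*
 * Hypothetical helper: write the names of the bits set in 'status'
 * into 'buf', separated by spaces, and return 'buf'.
 */
static const char *ata_status_str(uint8_t status, char *buf, size_t len)
{
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(status_bits) / sizeof(status_bits[0]); i++) {
        int n;

        if (!(status & status_bits[i].mask))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", status_bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* output truncated, stop appending */
        used += (size_t)n;
    }
    return buf;
}
```

Called with 0xd0 and a 64-byte buffer, this yields the names for bits 0x80, 0x40 and 0x10: the port reports busy throughout the 30 seconds, which matches Alan's description of a device that never came back.
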
>
> That would be this patch, currently not merged, not maintained,
> and probably needs rework for some chipsets. But for the record:
>
>
> Tejun Heo wrote:
> >Jeff Garzik wrote:
> >>Tejun Heo wrote:
> >>>Alan Cox wrote:
> >>>>>I think there have been enough cases where this draining was necessary.
> >>>>> IIRC, ata_piix was involved in those cases, right? If so, can you
> >>>>>please submit a patch which applies this only to affected controllers?
> >>>>>I don't feel too confident about applying this to all SFF controllers.
> >>>>Old IDE does it on all controllers bar a couple. So we have a very good
> >>>>knowledge of what does/doesn't work. The one that needs care in old ide
> >>>>is an ordering issue where a state machine reset done first causes the
> >>>>drain of the I/O to hang.
> >>>Hmmm... So, do we apply draining to all PATA? Or is ata_piix SATA
> >>>affected too?
> >>I would think all SFF controllers, since a lot of first gen SATA are
> >>really bridged solutions. If they are flagging DRQ, I say oblige them :)
> >
> >Alright, then the posted patch should be good enough. Mark, can you be
> >bothered to regenerate the patch and post it one more time (again)? It
> >seems we all agree the update is needed.
>
> I think this original patch still applies cleanly on at least 2.6.23-rc7.
>
> Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation,
> rather than just getting stuck there forever.
>
> Signed-off-by: Mark Lord <mlord@xxxxxxxxx>
> ---
>
> --- old/drivers/ata/libata-sff.c 2007-09-28 09:29:22.000000000 -0400
> +++ linux/drivers/ata/libata-sff.c 2007-09-28 09:39:44.000000000 -0400
> @@ -420,6 +420,28 @@
> ap->ops->irq_on(ap);
> }
>
> +static void ata_drain_fifo(struct ata_port *ap, struct ata_queued_cmd *qc)
> +{
> + u8 stat = ata_chk_status(ap);
> + /*
> + * Try to clear stuck DRQ if necessary,
> + * by reading/discarding up to two sectors worth of data.
> + */
> + if ((stat & ATA_DRQ) && (!qc || qc->dma_dir != DMA_TO_DEVICE)) {
> + unsigned int i;
> + unsigned int limit = qc ? qc->sect_size : ATA_SECT_SIZE;
> +
> + printk(KERN_WARNING "Draining up to %u words from data FIFO.\n",
> + limit);
> + for (i = 0; i < limit ; ++i) {
> + ioread16(ap->ioaddr.data_addr);
> + if (!(ata_chk_status(ap) & ATA_DRQ))
> + break;
> + }
> + printk(KERN_WARNING "Drained %u/%u words.\n", i, limit);
> + }
> +}
> +
> /**
> * ata_bmdma_drive_eh - Perform EH with given methods for BMDMA controller
> * @ap: port to handle error for
> @@ -476,7 +498,7 @@
> }
>
> ata_altstatus(ap);
> - ata_chk_status(ap);
> + ata_drain_fifo(ap, qc);
> ap->ops->irq_clear(ap);
>
> spin_unlock_irqrestore(ap->lock, flags);
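
To make the idea behind the quoted patch easier to follow, here is a minimal user-space model of a bounded FIFO drain: keep reading and discarding 16-bit words while the device asserts DRQ, but never more than a fixed limit. Everything here (struct mock_port, mock_status, mock_read_data, drain_fifo) is hypothetical and stands in for the real port accessors; it is a sketch of the technique, not the libata code. Unlike the patch, it tests DRQ before each read and counts reads explicitly, so its count can differ by one from the loop index the patch prints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ST_DRQ 0x08 /* standard ATA "data request" status bit */

/* Hypothetical stand-in for a port: words the device still wants read. */
struct mock_port {
    unsigned int fifo_words;
};

/* The mock device asserts DRQ for as long as its FIFO is not empty. */
static uint8_t mock_status(const struct mock_port *p)
{
    return p->fifo_words ? ST_DRQ : 0;
}

/* Reading the data register consumes one word; the value is discarded. */
static uint16_t mock_read_data(struct mock_port *p)
{
    if (p->fifo_words)
        p->fifo_words--;
    return 0;
}

/*
 * Discard words while DRQ is set, up to 'limit' reads.
 * Returns the number of words actually read.
 */
static unsigned int drain_fifo(struct mock_port *p, unsigned int limit)
{
    unsigned int drained = 0;

    assert(p != NULL);
    while (drained < limit && (mock_status(p) & ST_DRQ)) {
        (void)mock_read_data(p);
        drained++;
    }
    return drained;
}
```

With a mock port holding 3 pending words and a limit of 512, the loop stops after 3 reads with DRQ clear. With 1000 pending words it stops at the limit and DRQ stays set, which is the case where the drain alone does not recover the device.
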