Re: sata_svw data corruption, strange problems

From: Benjamin Herrenschmidt
Date: Mon Jun 23 2008 - 05:23:49 EST


On Mon, 2008-06-23 at 11:01 +0200, Pavel Machek wrote:
> On Mon 2008-06-23 17:56:32, Tejun Heo wrote:
> > Pavel Machek wrote:
> > > On Mon 2008-06-23 10:39:40, Andreas Schwab wrote:
> > >> Pavel Machek <pavel@xxxxxxx> writes:
> > >>
> > >>> + controller, the controller could hang. In other cases it
> > >>> + could return partial data returning in data
> > >>> + corruption. This problem has been seen in PPC systems and
> > >> s/returning/resulting/ ?
> > >
> > > Fix thinko in sata_svw comment.
> > >
> > > Signed-off-by: Pavel Machek <pavel@xxxxxxx>
> >
> > Please collapse into one patch. Thanks.

Am I the only one who finds Pavel's variant almost as obscure as
the original one? :-)

It should explain precisely what the workaround is, i.e. to start the
DMA there instead of where it is normally started, which is the
bmdma_setup() function.
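
To make the ordering concrete, here is a rough toy model of what the
quoted hunk does: it sets ATA_DMA_START first and only then issues the
r/w command. Everything below (struct toy_ctrl and the toy_* helpers) is
hypothetical and made up for illustration; it is not libata code, and
the "drive" simply answers instantly, as if the request hit its cache.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy controller state, not a real libata structure. */
struct toy_ctrl {
	bool dma_started;	/* host DMA engine has been started */
	bool data_dropped;	/* data arrived before DMA was started */
};

/* Stand-in for setting the DMA start bit in the DMA command register. */
static void toy_start_dma(struct toy_ctrl *c)
{
	assert(c);
	c->dma_started = true;
}

/*
 * Stand-in for issuing the r/w command. The toy drive returns its data
 * immediately, so if DMA is not running yet the data has nowhere to go.
 */
static void toy_exec_command(struct toy_ctrl *c)
{
	assert(c);
	if (!c->dma_started)
		c->data_dropped = true;
}

/* Command first, DMA start second: exposed to the race. */
bool toy_command_first_is_safe(void)
{
	struct toy_ctrl c = { false, false };

	toy_exec_command(&c);
	toy_start_dma(&c);
	return !c.data_dropped;
}

/* DMA start first, command second: the order used in the quoted hunk. */
bool toy_dma_first_is_safe(void)
{
	struct toy_ctrl c = { false, false };

	toy_start_dma(&c);
	toy_exec_command(&c);
	return !c.data_dropped;
}
```

In this model toy_command_first_is_safe() reports the race and
toy_dma_first_is_safe() does not. The real hardware failure modes (hang,
partial data) are of course not modelled here.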

BTW, Tejun, I suppose that starting DMA after issuing the
command is standard practice for legacy/SFF-type controllers? Or is it
just because that's how Linux has done it until now?

Ben.

> ---
>
> Clarify comment in sata_svw.c.
>
> Signed-off-by: Pavel Machek <pavel@xxxxxxx>
>
> diff --git a/drivers/ata/sata_svw.c b/drivers/ata/sata_svw.c
> index 16aa683..fb13b82 100644
> --- a/drivers/ata/sata_svw.c
> +++ b/drivers/ata/sata_svw.c
> @@ -253,21 +253,29 @@ static void k2_bmdma_start_mmio(struct a
> /* start host DMA transaction */
> dmactl = readb(mmio + ATA_DMA_CMD);
> writeb(dmactl | ATA_DMA_START, mmio + ATA_DMA_CMD);
> - /* There is a race condition in certain SATA controllers that can
> - be seen when the r/w command is given to the controller before the
> - host DMA is started. On a Read command, the controller would initiate
> - the command to the drive even before it sees the DMA start. When there
> - are very fast drives connected to the controller, or when the data request
> - hits in the drive cache, there is the possibility that the drive returns a part
> - or all of the requested data to the controller before the DMA start is issued.
> - In this case, the controller would become confused as to what to do with the data.
> - In the worst case when all the data is returned back to the controller, the
> - controller could hang. In other cases it could return partial data returning
> - in data corruption. This problem has been seen in PPC systems and can also appear
> - on an system with very fast disks, where the SATA controller is sitting behind a
> - number of bridges, and hence there is significant latency between the r/w command
> - and the start command. */
> - /* issue r/w command if the access is to ATA*/
> + /* This works around possible data corruption.
> +
> + On certain SATA controllers that can be seen when the r/w
> + command is given to the controller before the host DMA is
> + started.
> +
> + On a Read command, the controller would initiate the
> + command to the drive even before it sees the DMA
> + start. When there are very fast drives connected to the
> + controller, or when the data request hits in the drive
> + cache, there is the possibility that the drive returns a
> + part or all of the requested data to the controller before
> + the DMA start is issued. In this case, the controller
> + would become confused as to what to do with the data. In
> + the worst case when all the data is returned back to the
> + controller, the controller could hang. In other cases it
> + could return partial data returning in data
> + corruption. This problem has been seen in PPC systems and
> + can also appear on an system with very fast disks, where
> + the SATA controller is sitting behind a number of bridges,
> + and hence there is significant latency between the r/w
> + command and the start command. */
> + /* issue r/w command if the access is to ATA */
> if (qc->tf.protocol == ATA_PROT_DMA)
> ap->ops->sff_exec_command(ap, &qc->tf);
> }
>
>
