Re: [PATCH v3 01/16] spi: dw: Add Tx/Rx finish wait methods to the MID DMA

From: Feng Tang
Date: Fri May 22 2020 - 08:03:35 EST


On Fri, May 22, 2020 at 02:32:35PM +0300, Serge Semin wrote:
> On Fri, May 22, 2020 at 03:58:44PM +0800, Feng Tang wrote:
> > Hi Serge,
> >
> > On Thu, May 21, 2020 at 06:33:17PM +0300, Serge Semin wrote:
> > > > > > > + dw_spi_dma_wait_rx_done(dws);
> > > > > >
> > > > > > I can understand the problem about TX, but I don't see how RX
> > > > > > will get hurt, can you elaborate more? thanks
> > > > > >
> > > > > > - Feng
> > > > >
> > > > > Your question is correct. You are right with your hypothesis. Ideally upon the
> > > > > dw_spi_dma_rx_done() execution Rx FIFO must be already empty. That's why the
> > > > > commit log signifies the error being mostly related with Tx FIFO. But
> > > > > practically there are many reasons why Rx FIFO might be left with data:
> > > > > DMA engine failures, incorrect DMA configuration (if DW SPI or DW DMA driver
> > > > > messed something up), controller hanging up, and so on. It's better to catch
> > > > > an error at this stage while propagating it up to the SPI device drivers.
> > > > > Especially seeing the wait-check implementation doesn't gives us much of the
> > > > > execution overhead in normal conditions. So by calling dw_spi_dma_wait_rx_done()
> > > > > we make sure that all the data has been fetched and we may freely get the
> > > > > buffers back to the client driver.
> > > >
> > > > I see your point about checking RX. But I still don't think checking
> > > > RX FIFO level is the right way to detect error. Some data left in
> > > > RX FIFO doesn't always mean a error, say for some case if there is
> > > > 20 words in RX FIFO, and the driver starts a DMA request for 16
> > > > words, then after a sucessful DMA transaction, there are 4 words
> > > > left without any error.
> > >
> > > Neither Tx nor Rx FIFO should be left with any data after transaction is
> > > finished. If they are then something has been wrong.
> > >
> > > See, every SPI transfer starts with FIFO clearance since we disable/enable the
> > > SPI controller by means of the SSIENR (spi_enable_chip(dws, 0) and
> > > spi_enable_chip(dws, 1) called in the dw_spi_transfer_one() callback). Here is the
> > > SSIENR register description: "It enables and disables all SPI Controller operations.
> > > When disabled, all serial transfers are halted immediately. Transmit and receive
> > > FIFO buffers are cleared when the device is disabled. It is impossible to program
> > > some of the SPI Controller control registers when enabled"
> > >
> > > No mater whether we start DMA request or perform the normal IRQ-based PIO, we
> > > request as much data as we need and neither Tx nor Rx FIFO are supposed to
> > > be left with any data after the request is finished. If data is left, then
> > > either we didn't push all of the necessary data to the SPI bus, or we didn't
> > > pull all the data from the FIFO, and this could have happened only due to some
> > > component mulfunction (drivers, DMA engine, SPI device). In any case the SPI
> > > device driver should be notified about the problem.
> >
> > Data left in TX FIFO and Data left in RX FIFO are 2 different stories. The
> > former in dma case means the dma hw/driver has done its job, and spi hw/driver
> > hasn't done its job of pushing out the data to spi slave devices,
>
> Agreed.
>
> > while the
> > latter means the spi hw/driver has done its job, while the dma hw/driver hasn't.
>
> In this particular case agreed, that the data left in the Rx FIFO means DMA
> hw/driver hasn't done its work right. Though SPI hw could be also a reason of
> the data left in FIFO (though this only a theoretical consideration).

Right, that's why I was initially very curious about this RX FIFO thing,
and if possible, please give some details in commit log about the data
left in TX FIFO problem, which will help future developers when they
met simliar bugs.

And I'm fine with adding the rx check, no matter the problem is in
dma side or spi side.

> >
> > And the code is called inside the dma rx channel callback, which means the
> > dma driver is saying "hey, I've done my job", but apparently it hasn't if
> > there is data left.
>
> Right, either it hasn't, or the DMA engine claimed it has, but still is doing
> something (asynchronously or something, depending on the hardware implementation),
> or it think it has, but in fact it hasn't due to whatever problem happened
> (software/hardware/etc.). In anyway we have to at least check whether it's
> really done with fetching data and to be on a safe side give it some time to
> make sure that the Rx FIFO isn't going to be emptied. Whatever problem it is
> having a non empty Rx FIFO at the stage of calling spi_finalize_current_transfer()
> means a certain error.
>
> >
> > As for the wait time
> >
> > + nents = dw_readl(dws, DW_SPI_RXFLR);
> > + ns = (NSEC_PER_SEC / spi_get_clk(dws)) * nents * dws->n_bytes *
> > + BITS_PER_BYTE;
> >
> > Using this formula for checking TX makes sense, but it doesn't for RX.
> > Because the time of pushing data in TX FIFO to spi device depends on
> > the clk, but the time of transferring RX FIFO to memory is up to
> > the DMA controller and peripheral bus.
>
> On this I agree with you. That formulae doesn't describe exactly the time left
> before the Rx FIFO gets empty. But at least it provides an upper limit on the
> time needed for the peripheral bus to fetch the data from FIFO. If for some
> reason the internal APB bus is slower than the SPI bus, then the hardware
> engineers screwed, since the CPU/DMA won't keep up with pulling data from Rx
> FIFO on time so the FIFO may get overflown. Though in this case CPU/DMA won't
> be able to push data to the Tx FIFO fast enough to cause the Rx FIFO overflown,
> so the problem might be unnoticeable until we enable the EEPROM-read or Rx-only
> modes of the DW APB SSI controller. Anyway I am pretty much sure all the systems
> have the internal bus much faster than the external SPI bus.
>
> Getting back to the formulae. I was thinking of how to make it better and here
> is what we can do. We can't predict neither the DMA controller performance,
> nor the performance of its driver. In this case we have no choice but to add
> some assumption to clarify the task. Let's assume that the reason why Rx FIFO is
> non-empty is that even though we are at the DMA completion callback, but the
> DMA controller is still fetching data in background (any other reason might be
> related with a bug, so we'll detect it here anyway). In this case we need to
> give it a time to finish its work. As far as I can see the DW_apb_ssi interface
> doesn't use PREADY APB signal, which means the IO access cycle will take 4
> reference clock periods for each read and write accesses. Thus taking all of
> these into account we can create the next formulae to measure the time needed to
> read all the data from the Rx FIFO:
>
> - ns = (NSEC_PER_SEC / spi_get_clk(dws)) * nents * dws->n_bytes *
> - BITS_PER_BYTE;
> + ns = (NSEC_PER_SEC / dws->max_freq) * nents * 4;
>
> By doing several busy-wait loop iteration we'll cover the DMA controller and
> it's driver possible latency.
>
> Feng, does it now makes sense for you now? If so, I'll replace the delay
> calculation formulae in the patch.

Frankly I don't have a good idea, if it really happens which means
something is abnormal, explicitly waiting for some micro-seconds may
also be acceptable?

> >
> > Also for the
> >
> > + while (dw_spi_dma_rx_busy(dws) && retry--)
> > + ndelay(ns);
> > +
> >
> > the rx busy bit is cleared after this rx/tx checking, and it should
> > be always true at this point. Am I mis-reading the code?
>
> Sorry I don't get your logic here. I am not checking the Rx busy bit here,
> but the Rx FIFO non-empty bit. Also SR register bits aren't cleared on read,
> so the status bits are left pending until the reason is cleared. In our case
> until Rx FIFO gets empty, which will happen eventually either at the point of
> all data finally being extracted from it or when the controller is disabled
> by means of the SSIENR register.

I did misread the code, I thought it is checking the busy bits, sorry
for that. Though the dw_spi_dma_rx_busy() name is a little confusing,
as checking the emptiness of RX FIFO is not dma bound.

Thanks,
Feng