Re: [PATCH 5/5] dmaengine: dw: Add DMA-channels mask cell support

From: Serge Semin
Date: Thu Jul 30 2020 - 13:12:02 EST


On Thu, Jul 30, 2020 at 07:41:46PM +0300, Andy Shevchenko wrote:
> On Thu, Jul 30, 2020 at 06:45:45PM +0300, Serge Semin wrote:
> > DW DMA IP-core provides a way to synthesize the DMA controller with
> > channels having different parameters like maximum burst-length,
> > multi-block support, maximum data width, etc. Those parameters both
> > explicitly and implicitly affect the channels performance. Since DMA slave
> > devices might be very demanding to the DMA performance, let's provide a
> > functionality for the slaves to be assigned with DW DMA channels, which
> > performance according to the platform engineer fulfill their requirements.
> > After this patch is applied it can be done by passing the mask of suitable
> > DMA-channels either directly in the dw_dma_slave structure instance or as
> > a fifth cell of the DMA DT-property. If mask is zero or not provided, then
> > there is no limitation on the channels allocation.
> >
> > For instance Baikal-T1 SoC is equipped with a DW DMAC engine, which first
> > two channels are synthesized with max burst length of 16, while the rest
> > of the channels have been created with max-burst-len=4. It would seem that
> > the first two channels must be faster than the others and should be more
> > preferable for the time-critical DMA slave devices. In practice it turned
> > out that the situation is quite the opposite. The channels with
> > max-burst-len=4 demonstrated a better performance than the channels with
> > max-burst-len=16 even when they both had been initialized with the same
> > settings. The performance drop of the first two DMA-channels made them
> > unsuitable for the DW APB SSI slave device. No matter what settings they
> > are configured with, full-duplex SPI transfers occasionally experience the
> > Rx FIFO overflow. It means that the DMA-engine doesn't keep up with
> > incoming data pace even though the SPI-bus is enabled with speed of 25MHz
> > while the DW DMA controller is clocked with 50MHz signal. There is no such
> > problem has been noticed for the channels synthesized with
> > max-burst-len=4.
>
> ...
>

> > + if (dws->channels && !(dws->channels & dwc->mask))
>
> You can drop the first check if...

See below.

>
> > + return false;
>
> ...
>
> > + if (dma_spec->args_count >= 4)
> > + slave.channels = dma_spec->args[3];
>
> ...you apply sane default here or somewhere else.

Alas I can't because dw_dma_slave structure is defined all over the kernel
drivers/spi/spi-dw-dma.c
drivers/spi/spi-pxa2xx-pci.c
drivers/tty/serial/8250/8250_lpss.c

These devices aren't always placed on the OF-based platforms. In that case the
corresponding DMA-channels won't be requested by means of the dw_dma_of_xlate()
method. So we have to preserve a default behavior if dws->channels is zero.

>
> ...
>
> > + fls(slave.channels) > dw->pdata->nr_channels))
>

> Does it really make sense?

It does to prevent the clients to specify an invalid channels mask, which can't
have bits set higher than the number of channels the engine supports.

>
> I think it can also be simplified to faster op, i.e.
> BIT(nr_channels) < slave.channels
> (but check for off-by-one errors)

Makes sense. Thanks. I'll replace it with the next statement:
slave.channels >= BIT(dw->pdata->nr_channels)

>
> ...
>

> > + * @channels: mask of the channels permitted for allocation (zero
> > + * value means any)
>
> Perhaps on one line?

I don't really care. If you insist on that, I'll make it a single line, but it
will be over 80 columns. 85 to be exact.

-Sergey

>
> --
> With Best Regards,
> Andy Shevchenko
>
>