Re: [PATCH] char: xillybus: Eliminate redundant wrappers to DMA related calls

From: Arnd Bergmann
Date: Mon Sep 27 2021 - 03:47:22 EST


On Sun, Sep 26, 2021 at 9:31 AM <eli.billauer@xxxxxxxxx> wrote:
>
> From: Eli Billauer <eli.billauer@xxxxxxxxx>
>
> The driver was originally written with the assumption that DMA-related
> functions require a different API depending on whether or not the device
> is PCIe based. Since Xillybus' driver supports devices on a PCIe bus (with
> xillybus_pcie) as well as connected directly to the processor (with
> xillybus_of), it originally used wrapper functions that ensure that
> a different API is used for each.
>
> This patch eliminates the said wrapper functions, as all use the same
> dma_* API now. This is most noticeable in the code deleted from
> xillybus_pcie.c and xillybus_of.c.
>
> There is still a need for some wrapper functions, however, which are merged
> from xillybus_pcie.c and xillybus_of.c into xillybus_core.c:
>
> (1) The two xilly_sync_for_*() functions are necessary, because the
> calls to the respective dma_sync_single_for_*() must be avoided on
> Xilinx Zynq-7000 chips iff the Xillybus device is connected
> through the ACP port, hence performing the device's DMA operations
> coherently. Since it's also possible to connect the device to a
> non-coherent port, the choice is conveyed to the driver through the
> device tree.
>
> (2) The call to dma_map_single() is wrapped by a function that uses the
> Managed Device (devres) framework, in the absence of a relevant
> function in the current kernel's API.
>
> Suggested-by: Christophe JAILLET <christophe.jaillet@xxxxxxxxxx>
> Signed-off-by: Eli Billauer <eli.billauer@xxxxxxxxx>

Very nice cleanup, just one comment:

> +/*
> + * Xilinx' Zynq-7000 allows connecting the device through a coherent DMA
> + * port as well as non-coherent ports. The @make_sync_calls entry is therefore
> + * used to keep track of whether cache synchronization is required. Hence the
> + * two wrapper functions below.
> + */
> +
> +static void xilly_sync_for_cpu(struct xilly_endpoint *ep,
> +			       dma_addr_t dma_handle,
> +			       size_t size,
> +			       int direction)
> +{
> +	if (ep->make_sync_calls)
> +		dma_sync_single_for_cpu(ep->dev, dma_handle,
> +					size, direction);
> +}
> +
> +static void xilly_sync_for_device(struct xilly_endpoint *ep,
> +				  dma_addr_t dma_handle,
> +				  size_t size,
> +				  int direction)
> +{
> +	if (ep->make_sync_calls)
> +		dma_sync_single_for_device(ep->dev, dma_handle,
> +					   size, direction);
> +}

These wrappers should not even be needed. When the device is
marked as coherent in DT, the dma_sync_*() calls are supposed
to do nothing, in a relatively efficient way. I would not expect
the extra conditional to give you any measurable performance
benefit over what you get normally, and it should not make a
functional difference either.
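
To make the point above concrete, here is a minimal userspace sketch in C.
It is not kernel code: struct mock_device, mock_dma_sync_single_for_cpu(),
wrapped_sync_for_cpu() and demo_equivalence() are hypothetical stand-ins
invented for illustration, and the cache_ops counter only simulates cache
maintenance. It assumes the sync helper checks the device's coherency
itself, which is what would make a driver-side flag redundant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device; only coherency matters here. */
struct mock_device {
	bool dma_coherent;      /* assumed to come from the device tree */
	unsigned int cache_ops; /* counts simulated cache maintenance ops */
};

/* Models a sync call that returns early for coherent devices. */
static void mock_dma_sync_single_for_cpu(struct mock_device *dev, size_t size)
{
	if (dev->dma_coherent)
		return;
	(void)size;             /* a real implementation would use the range */
	dev->cache_ops++;
}

/* Models the driver-side wrapper with its own flag, as in the patch. */
static void wrapped_sync_for_cpu(struct mock_device *dev,
				 bool make_sync_calls, size_t size)
{
	if (make_sync_calls)
		mock_dma_sync_single_for_cpu(dev, size);
}

/* For a coherent device, both paths end up doing no cache maintenance. */
void demo_equivalence(void)
{
	struct mock_device direct = { .dma_coherent = true, .cache_ops = 0 };
	struct mock_device wrapped = { .dma_coherent = true, .cache_ops = 0 };

	mock_dma_sync_single_for_cpu(&direct, 4096);
	wrapped_sync_for_cpu(&wrapped, false, 4096);
	assert(direct.cache_ops == 0 && wrapped.cache_ops == 0);
}
```

In this model the wrapper's flag only duplicates the check that the sync
helper already performs, so calling the helper directly gives the same
result for coherent and non-coherent devices alike.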

Can you remove the wrappers and the ->make_sync_calls flag?

Arnd