Re: [PATCH v5 6/6] soc: renesas: Add L2 cache management for RZ/Five SoC

From: Samuel Holland
Date: Tue Dec 27 2022 - 22:24:32 EST


On 12/12/22 05:55, Prabhakar wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@xxxxxxxxxxxxxx>
>
> I/O Coherence Port (IOCP) provides an AXI interface for connecting
> external non-caching masters, such as DMA controllers. The accesses
> from IOCP are coherent with D-Caches and L2 Cache.
>
> IOCP is a specification option and is disabled on the Renesas RZ/Five
> SoC due to this reason IP blocks using DMA will fail.
>
> The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> block that allows dynamic adjustment of memory attributes in the runtime.
> It contains a configurable amount of PMA entries implemented as CSR
> registers to control the attributes of memory locations in interest.
> Below are the memory attributes supported:
> * Device, Non-bufferable
> * Device, bufferable
> * Memory, Non-cacheable, Non-bufferable
> * Memory, Non-cacheable, Bufferable
> * Memory, Write-back, No-allocate
> * Memory, Write-back, Read-allocate
> * Memory, Write-back, Write-allocate
> * Memory, Write-back, Read and Write-allocate
>
> More info about PMA (section 10.3):
> Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
>
> As a workaround for SoCs with IOCP disabled CMO needs to be handled by
> software. Firstly OpenSBI configures the memory region as
> "Memory, Non-cacheable, Bufferable" and passes this region as a global
> shared dma pool as a DT node. With DMA_GLOBAL_POOL enabled all DMA
> allocations happen from this region and synchronization callbacks are
> implemented to synchronize when doing DMA transactions.
>
> Example PMA region passes as a DT node from OpenSBI:
> reserved-memory {
> #address-cells = <2>;
> #size-cells = <2>;
> ranges;
>
> pma_resv0@58000000 {
> compatible = "shared-dma-pool";
> reg = <0x0 0x58000000 0x0 0x08000000>;
> no-map;
> linux,dma-default;
> };
> };
>
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@xxxxxxxxxxxxxx>
> ---
> v4 -> v5
> * Dropped code for configuring L2 cache
> * Dropped code for configuring PMA
> * Updated commit message
> * Added comments
> * Changed static branch enable/disable order
>
> RFC v3 -> v4
> * Made use of runtime patching instead of compile time
> * Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
> * Added a check to make sure cache line size is always 64 bytes
> * Renamed folder rzf -> rzfive
> * Improved Kconfig description
> * Dropped L2 cache configuration
> * Dropped unnecessary casts
> * Fixed comments pointed by Geert.
> ---
> arch/riscv/include/asm/cacheflush.h | 8 +
> arch/riscv/include/asm/errata_list.h | 28 ++-
> drivers/soc/renesas/Kconfig | 6 +
> drivers/soc/renesas/Makefile | 2 +
> drivers/soc/renesas/rzfive/Kconfig | 6 +
> drivers/soc/renesas/rzfive/Makefile | 3 +
> drivers/soc/renesas/rzfive/ax45mp_cache.c | 256 ++++++++++++++++++++++
> 7 files changed, 303 insertions(+), 6 deletions(-)
> create mode 100644 drivers/soc/renesas/rzfive/Kconfig
> create mode 100644 drivers/soc/renesas/rzfive/Makefile
> create mode 100644 drivers/soc/renesas/rzfive/ax45mp_cache.c

Thanks for the updates! This looks much cleaner and easier to understand
now that the driver is only trying to do one thing.

> diff --git a/drivers/soc/renesas/rzfive/Makefile b/drivers/soc/renesas/rzfive/Makefile
> new file mode 100644
> index 000000000000..2012e7fb978d
> --- /dev/null
> +++ b/drivers/soc/renesas/rzfive/Makefile
...
> +void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr, size_t size, int dir, int ops)
> +{
> + if (ops == NON_COHERENT_DMA_PREP)
> + return;
> +
> + if (!static_branch_unlikely(&ax45mp_l2c_configured))
> + return;
> +
> + if (ops == NON_COHERENT_SYNC_DMA_FOR_DEVICE) {
> + switch (dir) {
> + case DMA_FROM_DEVICE:
> + ax45mp_cpu_dma_inval_range(vaddr, size);
> + break;
> + case DMA_TO_DEVICE:
> + case DMA_BIDIRECTIONAL:
> + ax45mp_cpu_dma_wb_range(vaddr, size);
> + break;
> + default:
> + break;
> + }
> + return;
> + }
> +
> + /* op == NON_COHERENT_SYNC_DMA_FOR_CPU */
> + if (dir == DMA_BIDIRECTIONAL || dir == DMA_FROM_DEVICE)
> + ax45mp_cpu_dma_inval_range(vaddr, size);

I think this at least deserves a comment explaining why it differs from
the clean/flush/invalidate choices in arch/riscv/mm/dma-noncoherent.c.

Regards,
Samuel