Re: [PATCH 1/3] arm64: extend ioremap for cacheable non-shareable memory

From: Mark Rutland
Date: Fri Apr 21 2017 - 05:12:44 EST


Hi,

I notice you missed Catalin and Will from Cc. In future, please ensure
that you Cc them when altering arm64 arch code.

On Thu, Apr 20, 2017 at 03:34:16PM -0400, Haiying Wang wrote:
> NXP arm64 based SoC needs to allocate cacheable and
> non-shareable memory for the software portals of
> Queue manager, so we extend the arm64 ioremap support
> for this memory attribute.

NAK to this patch.

It is not possible to safely use Non-Shareable attributes in Linux page
tables, given that these page tables are shared by all PEs (i.e. CPUs).

My understanding is that if several PEs map a region as Non-Shareable,
the usual background behaviour of the PEs (e.g. speculation,
prefetching, natural eviction) mean that uniprocessor semantics are not
guaranteed (i.e. a read following a write may see stale data).

For example, in a system like:

+------+ +------+
| PE-a | | PE-b |
+------+ +------+
| L1-a | | L1-b |
+------+ +------+
|| ||
+----------------+
| Shared cache |
+----------------+
||
+----------------+
| Memory |
+----------------+

... you could have a sequence like:

1) PE-a allocates a line into L1-a for address X in preparation for a
store.

2) PE-b allocates a line into L1-b for the same address X as a result of
speculation.

3) PE-a makes a store to the line in L1-a. Since address X is mapped as
Non-shareable, no snoops are performed to keep other copies of the
line in sync.

4) As a result of explicit maintenance or as a natural eviction, L1-a
evicts its line into shared cache. The shared cache subsequently
evicts this to memory.

5) L1-b evicts its line to shared cache as a natural eviction.

6) L1-a fetches the line from shared cache in response to a load by
PE-a, returning stale data (i.e. the store is lost).

No amount of cache maintenance can avoid this. In general, Non-Shareable
mappings are a bad idea.

Thanks,
Mark.

> Signed-off-by: Haiying Wang <Haiying.Wang@xxxxxxx>
> ---
> arch/arm64/include/asm/io.h | 1 +
> arch/arm64/include/asm/pgtable-prot.h | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
> index 0c00c87..b6f03e7 100644
> --- a/arch/arm64/include/asm/io.h
> +++ b/arch/arm64/include/asm/io.h
> @@ -170,6 +170,7 @@ extern void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size);
> #define ioremap_nocache(addr, size) __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))
> #define ioremap_wc(addr, size) __ioremap((addr), (size), __pgprot(PROT_NORMAL_NC))
> #define ioremap_wt(addr, size) __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))
> +#define ioremap_cache_ns(addr, size) __ioremap((addr), (size), __pgprot(PROT_NORMAL_NS))
> #define iounmap __iounmap
>
> /*
> diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
> index 2142c77..7fc7910 100644
> --- a/arch/arm64/include/asm/pgtable-prot.h
> +++ b/arch/arm64/include/asm/pgtable-prot.h
> @@ -42,6 +42,7 @@
> #define PROT_NORMAL_NC (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_NC))
> #define PROT_NORMAL_WT (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_WT))
> #define PROT_NORMAL (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL))
> +#define PROT_NORMAL_NS (PTE_TYPE_PAGE | PTE_AF | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL))
>
> #define PROT_SECT_DEVICE_nGnRE (PROT_SECT_DEFAULT | PMD_SECT_PXN | PMD_SECT_UXN | PMD_ATTRINDX(MT_DEVICE_nGnRE))
> #define PROT_SECT_NORMAL (PROT_SECT_DEFAULT | PMD_SECT_PXN | PMD_SECT_UXN | PMD_ATTRINDX(MT_NORMAL))
> --
> 2.7.4
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel