Re: [PATCH RFC v1] arm64: mm: change mem_map to use block/section mapping with crashkernel

From: Catalin Marinas
Date: Wed Apr 13 2022 - 12:53:46 EST


On Tue, Apr 12, 2022 at 05:07:56PM +0800, Guanghui Feng wrote:
> There are many changes and discussions:
> commit 031495635b46
> commit 1a8e1cef7603
> commit 8424ecdde7df
> commit 0a30c53573b0
> commit 2687275a5843
>
> When the DMA/DMA32 zones and crashkernel are in use, with rodata full and
> kfence disabled, mem_map uses non block/section mappings (because crashkernel
> needs to shrink the region at page granularity). This degrades performance
> for large contiguous memory accesses in the kernel (memcpy/memmove, etc).
>
> This patch first does block/section mapping in mem_map and reserves the
> crashkernel memory. It then walks the page table to split block/section
> mappings into non block/section mappings [only] for the crashkernel memory.
> This speeds up memory access by about 10-20% and noticeably reduces CPU
> dTLB misses on some platforms.

Do you actually have some real world use-cases where this improvement
matters? I don't deny that large memcpy over the kernel linear map may
be slightly faster but where does this really matter?

> +static void init_crashkernel_pmd(pud_t *pudp, unsigned long addr,
> +				 unsigned long end, phys_addr_t phys,
> +				 pgprot_t prot,
> +				 phys_addr_t (*pgtable_alloc)(int), int flags)
> +{
> +	phys_addr_t map_offset;
> +	unsigned long next;
> +	pmd_t *pmdp;
> +	pmdval_t pmdval;
> +
> +	pmdp = pmd_offset(pudp, addr);
> +	do {
> +		next = pmd_addr_end(addr, end);
> +		if (!pmd_none(*pmdp) && pmd_sect(*pmdp)) {
> +			phys_addr_t pte_phys = pgtable_alloc(PAGE_SHIFT);
> +			pmd_clear(pmdp);
> +			pmdval = PMD_TYPE_TABLE | PMD_TABLE_UXN;
> +			if (flags & NO_EXEC_MAPPINGS)
> +				pmdval |= PMD_TABLE_PXN;
> +			__pmd_populate(pmdp, pte_phys, pmdval);
> +			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

The architecture requires us to do a break-before-make here, so
pmd_clear(), TLBI, __pmd_populate() - in this order. And that's where it
gets tricky: if the kernel happens to access this pmd range while it is
unmapped, you'd get a translation fault.
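
To illustrate the ordering, here is a rough userspace model (not kernel
code, and heavily simplified). It assumes a single pmd descriptor and a
single TLB slot that can hold a stale copy of it. Every toy_* name is made
up for this sketch; none of them are kernel or architecture APIs.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy userspace model, not kernel code: one pmd descriptor plus a single
 * TLB slot that may hold a stale copy of it. All names are hypothetical.
 */
enum toy_desc { TOY_INVALID, TOY_BLOCK, TOY_TABLE };

struct toy_mmu {
	enum toy_desc pmd;	/* descriptor in memory */
	enum toy_desc tlb;	/* cached translation, TOY_INVALID if none */
	bool conflict;		/* old and new valid entries were live together */
};

/* Write a descriptor; flag the case break-before-make is meant to prevent. */
static void toy_set_pmd(struct toy_mmu *m, enum toy_desc val)
{
	if (val != TOY_INVALID && m->tlb != TOY_INVALID && m->tlb != val)
		m->conflict = true;
	m->pmd = val;
}

/* Stand-in for a TLBI covering the range: drop the cached translation. */
static void toy_tlbi(struct toy_mmu *m)
{
	m->tlb = TOY_INVALID;
}

/* An access refills the TLB on a miss; false models a translation fault. */
static bool toy_access(struct toy_mmu *m)
{
	if (m->tlb == TOY_INVALID)
		m->tlb = m->pmd;
	return m->tlb != TOY_INVALID;
}

/* Order in the posted hunk: clear, populate, then flush. */
static void toy_split_as_posted(struct toy_mmu *m)
{
	toy_set_pmd(m, TOY_INVALID);
	toy_set_pmd(m, TOY_TABLE);
	toy_tlbi(m);
}

/* Break-before-make: clear, flush, and only then populate. */
static void toy_split_bbm(struct toy_mmu *m)
{
	toy_set_pmd(m, TOY_INVALID);	/* break */
	toy_tlbi(m);
	assert(m->tlb == TOY_INVALID);	/* nothing stale left to conflict with */
	toy_set_pmd(m, TOY_TABLE);	/* make */
}
```

In this model, toy_split_as_posted() sets the conflict flag because the
table entry goes live while the old block entry may still be cached, while
toy_split_bbm() does not. The cost of the correct order is the window
between the TLBI and the make step, where toy_access() returns false, i.e.
the translation fault mentioned above.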

--
Catalin