Re: [PATCH 2/2] riscv: Use PUD/PGDIR entries for linear mapping when possible

From: Alex Ghiti
Date: Sun Jun 21 2020 - 05:40:30 EST


Hi Atish,

Le 6/20/20 Ã 5:04 AM, Alex Ghiti a ÃcritÂ:
Hi Atish,

Le 6/19/20 Ã 2:16 PM, Atish Patra a ÃcritÂ:
On Thu, Jun 18, 2020 at 9:28 PM Alex Ghiti <alex@xxxxxxxx> wrote:
Hi Atish,

Le 6/18/20 Ã 8:47 PM, Atish Patra a Ãcrit :
On Wed, Jun 3, 2020 at 8:38 AM Alexandre Ghiti <alex@xxxxxxxx> wrote:
Improve best_map_size so that PUD or PGDIR entries are used for linear
mapping when possible as it allows better TLB utilization.

Signed-off-by: Alexandre Ghiti <alex@xxxxxxxx>
---
ÂÂ arch/riscv/mm/init.c | 45 +++++++++++++++++++++++++++++++++-----------
ÂÂ 1 file changed, 34 insertions(+), 11 deletions(-)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 9a5c97e091c1..d275f9f834cf 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -424,13 +424,29 @@ static void __init create_pgd_mapping(pgd_t *pgdp,
ÂÂÂÂÂÂÂÂÂ create_pgd_next_mapping(nextp, va, pa, sz, prot);
ÂÂ }

-static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
+static bool is_map_size_ok(uintptr_t map_size, phys_addr_t base,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ uintptr_t base_virt, phys_addr_t size)
ÂÂ {
-ÂÂÂÂÂÂ /* Upgrade to PMD_SIZE mappings whenever possible */
-ÂÂÂÂÂÂ if ((base & (PMD_SIZE - 1)) || (size & (PMD_SIZE - 1)))
-ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ return PAGE_SIZE;
+ÂÂÂÂÂÂ return !((base & (map_size - 1)) || (base_virt & (map_size - 1)) ||
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ (size < map_size));
+}
+
+static uintptr_t __init best_map_size(phys_addr_t base, uintptr_t base_virt,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ phys_addr_t size)
+{
+#ifndef __PAGETABLE_PMD_FOLDED
+ÂÂÂÂÂÂ if (is_map_size_ok(PGDIR_SIZE, base, base_virt, size))
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ return PGDIR_SIZE;
+
+ÂÂÂÂÂÂ if (pgtable_l4_enabled)
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ if (is_map_size_ok(PUD_SIZE, base, base_virt, size))
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ return PUD_SIZE;
+#endif
+
+ÂÂÂÂÂÂ if (is_map_size_ok(PMD_SIZE, base, base_virt, size))
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ return PMD_SIZE;

-ÂÂÂÂÂÂ return PMD_SIZE;
+ÂÂÂÂÂÂ return PAGE_SIZE;
ÂÂ }

ÂÂ /*
@@ -576,7 +592,7 @@ void create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size)
ÂÂ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
ÂÂ {
ÂÂÂÂÂÂÂÂÂ uintptr_t va, end_va;
-ÂÂÂÂÂÂ uintptr_t map_size = best_map_size(load_pa, MAX_EARLY_MAPPING_SIZE);
+ÂÂÂÂÂÂ uintptr_t map_size;

ÂÂÂÂÂÂÂÂÂ load_pa = (uintptr_t)(&_start);
ÂÂÂÂÂÂÂÂÂ load_sz = (uintptr_t)(&_end) - load_pa;
@@ -587,6 +603,7 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)

ÂÂÂÂÂÂÂÂÂ kernel_virt_addr = KERNEL_VIRT_ADDR;

+ÂÂÂÂÂÂ map_size = best_map_size(load_pa, PAGE_OFFSET, MAX_EARLY_MAPPING_SIZE);
ÂÂÂÂÂÂÂÂÂ va_pa_offset = PAGE_OFFSET - load_pa;
ÂÂÂÂÂÂÂÂÂ va_kernel_pa_offset = kernel_virt_addr - load_pa;
ÂÂÂÂÂÂÂÂÂ pfn_base = PFN_DOWN(load_pa);
@@ -700,6 +717,8 @@ static void __init setup_vm_final(void)

ÂÂÂÂÂÂÂÂÂ /* Map all memory banks */
ÂÂÂÂÂÂÂÂÂ for_each_memblock(memory, reg) {
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ uintptr_t remaining_size;
+
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ start = reg->base;
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ end = start + reg->size;

@@ -707,15 +726,19 @@ static void __init setup_vm_final(void)
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ break;
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ if (memblock_is_nomap(reg))
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ continue;
-ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ if (start <= __pa(PAGE_OFFSET) &&
-ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ __pa(PAGE_OFFSET) < end)
-ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ start = __pa(PAGE_OFFSET);

-ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ map_size = best_map_size(start, end - start);
-ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ for (pa = start; pa < end; pa += map_size) {
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ pa = start;
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ remaining_size = reg->size;
+
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ while (remaining_size) {
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ va = (uintptr_t)__va(pa);
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ map_size = best_map_size(pa, va, remaining_size);
+
create_pgd_mapping(swapper_pg_dir, va, pa,
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ map_size, PAGE_KERNEL);
+
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ pa += map_size;
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ remaining_size -= map_size;
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ }
ÂÂÂÂÂÂÂÂÂ }

This may not work in the RV32 with 2G memory and if the map_size is
determined to be a page size
for the last memblock. Both pa & remaining_size will overflow and the
loop will try to map memory from zero again.
I'm not sure I understand: if pa starts at 0x8000_0000 and size is 2G,
then pa will overflow in the last iteration, but remaining_size will
then be equal to 0 right ?

Not unless the remaining_size is at least page size aligned. The last
remaining size would "fff".
It will overflow as well after subtracting the map_size.


While fixing this issue, I noticed that if the size in the device tree is not aligned on PAGE_SIZE, the size is then automatically realigned on PAGE_SIZE: see early_init_dt_add_memory_arch where size is and-ed with PAGE_MASK to remove the unaligned part.

So the issue does not need to be fixed :)

Thanks anyway,

Alex



And by the way, I realize that this loop only handles sizes that are
aligned on map_size.

Yeah.


Thanks for noticing, I send a v2.

Alex



Thanks,

Alex


--
2.20.1