Re: [PATCH] riscv: Fix memblock reservation for device tree blob

From: Albert Ou
Date: Tue Oct 08 2019 - 19:31:22 EST


On 2019-10-08 15:38:15 -0700, Palmer Dabbelt <palmer@xxxxxxxxxx> wrote:
> On Fri, 20 Sep 2019 21:34:57 PDT (-0700), anup@xxxxxxxxxxxxxx wrote:
> > On Sat, Sep 21, 2019 at 6:30 AM Albert Ou <aou@xxxxxxxxxxxxxxxxx> wrote:
> >>
> >> This fixes an error with how the FDT blob is reserved in memblock.
> >> An incorrect physical address calculation exposed the FDT header to
> >> unintended corruption, which typically manifested with of_fdt_raw_init()
> >> faulting during late boot after fdt_totalsize() returned a wrong value.
> >> Systems with smaller physical memory sizes more frequently trigger this
> >> issue, as the kernel is more likely to allocate from the DMA32 zone
> >> where bbl places the DTB after the kernel image.
> >>
> >> Commit 671f9a3e2e24 ("RISC-V: Setup initial page tables in two stages")
> >> changed the mapping of the DTB to reside in the fixmap area.
> >> Consequently, early_init_fdt_reserve_self() cannot be used anymore in
> >> setup_bootmem() since it relies on __pa() to derive a physical address,
> >> which does not work for dtb_early_va because it is no longer a valid
> >> kernel logical address.
> >>
> >> The reserved[0x1] region shows the effect of the pointer underflow
> >> resulting from the __pa(initial_boot_params) offset subtraction:
> >>
> >> [ 0.000000] MEMBLOCK configuration:
> >> [ 0.000000] memory size = 0x000000001fe00000 reserved size = 0x0000000000a2e514
> >> [ 0.000000] memory.cnt = 0x1
> >> [ 0.000000] memory[0x0] [0x0000000080200000-0x000000009fffffff], 0x000000001fe00000 bytes flags: 0x0
> >> [ 0.000000] reserved.cnt = 0x2
> >> [ 0.000000] reserved[0x0] [0x0000000080200000-0x0000000080c2dfeb], 0x0000000000a2dfec bytes flags: 0x0
> >> [ 0.000000] reserved[0x1] [0xfffffff080100000-0xfffffff080100527], 0x0000000000000528 bytes flags: 0x0
> >>
> >> With the fix applied:
> >>
> >> [ 0.000000] MEMBLOCK configuration:
> >> [ 0.000000] memory size = 0x000000001fe00000 reserved size = 0x0000000000a2e514
> >> [ 0.000000] memory.cnt = 0x1
> >> [ 0.000000] memory[0x0] [0x0000000080200000-0x000000009fffffff], 0x000000001fe00000 bytes flags: 0x0
> >> [ 0.000000] reserved.cnt = 0x2
> >> [ 0.000000] reserved[0x0] [0x0000000080200000-0x0000000080c2dfeb], 0x0000000000a2dfec bytes flags: 0x0
> >> [ 0.000000] reserved[0x1] [0x0000000080e00000-0x0000000080e00527], 0x0000000000000528 bytes flags: 0x0
> >
> > Thanks for catching this issue.
> >
> > Most of us probably did not notice this issue because:
> > 1. We generally have plenty of RAM on QEMU and SiFive Unleashed
> > 2. Most people use OpenSBI FW_JUMP on QEMU and U-Boot on
> > SiFive Unleashed to boot Linux, which places the FDT quite far
> > from the end of the Linux kernel
> >
> > The Linux ARM64 kernel also uses the fixmap to access the FDT, and
> > early_init_fdt_reserve_self() is not used there either.
> >
> >>
> >> Fixes: 671f9a3e2e24 ("RISC-V: Setup initial page tables in two stages")
> >> Signed-off-by: Albert Ou <aou@xxxxxxxxxxxxxxxxx>
> >> ---
> >> arch/riscv/mm/init.c | 13 ++++++++++++-
> >> 1 file changed, 12 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> >> index f0ba713..52d007c 100644
> >> --- a/arch/riscv/mm/init.c
> >> +++ b/arch/riscv/mm/init.c
> >> @@ -11,6 +11,7 @@
> >> #include <linux/swap.h>
> >> #include <linux/sizes.h>
> >> #include <linux/of_fdt.h>
> >> +#include <linux/libfdt.h>
> >>
> >> #include <asm/fixmap.h>
> >> #include <asm/tlbflush.h>
> >> @@ -82,6 +83,8 @@ static void __init setup_initrd(void)
> >> }
> >> #endif /* CONFIG_BLK_DEV_INITRD */
> >>
> >> +static phys_addr_t __dtb_pa __initdata;
> >
> > Maybe dtb_early_pa would be a more consistent name
> > than __dtb_pa, since it matches dtb_early_va
> > used below.
> >
> >> +
> >> void __init setup_bootmem(void)
> >> {
> >> struct memblock_region *reg;
> >> @@ -117,7 +120,12 @@ void __init setup_bootmem(void)
> >> setup_initrd();
> >> #endif /* CONFIG_BLK_DEV_INITRD */
> >>
> >> - early_init_fdt_reserve_self();
> >> + /*
> >> + * Avoid using early_init_fdt_reserve_self() since __pa() does
> >> + * not work for DTB pointers that are fixmap addresses
> >> + */
> >> + memblock_reserve(__dtb_pa, fdt_totalsize(dtb_early_va));
> >> +
> >> early_init_fdt_scan_reserved_mem();
> >> memblock_allow_resize();
> >> memblock_dump_all();
> >> @@ -333,6 +341,7 @@ static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
> >> "not use absolute addressing."
> >> #endif
> >>
> >> +
> >
> > Please remove this newline addition.
> >
> >> asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> >> {
> >> uintptr_t va, end_va;
> >> @@ -393,6 +402,8 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> >>
> >> /* Save pointer to DTB for early FDT parsing */
> >> dtb_early_va = (void *)fix_to_virt(FIX_FDT) + (dtb_pa & ~PAGE_MASK);
> >> + /* Save physical address for memblock reservation */
> >> + __dtb_pa = dtb_pa;
> >> }
> >>
> >> static void __init setup_vm_final(void)
> >> --
> >> 2.7.4
> >>
> >>
> >
> > This deserves to be a stable kernel fix as well.
> > You should add:
> > Cc: stable@xxxxxxxxxxxxxxx
> > to your commit description.
> >
> > Apart from the minor nits above:
> >
> > Reviewed-by: Anup Patel <anup@xxxxxxxxxxxxxx>
> >
> > I tested this patch on both RV64 and RV32 on QEMU with a
> > Yocto rootfs.
> >
> > Tested-by: Anup Patel <anup@xxxxxxxxxxxxxx>
> >
> > Regards,
> > Anup
>
> Albert: Do you plan on spinning a v2 of the patch set?
>

v2 was sent last week and has already been applied as
922b0375fc93fb1a20c5617e37c389c26bbccb70 by Paul.
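
For anyone reading this thread later: the bogus reserved[0x1] range in the
original report can be reproduced with plain integer arithmetic. The sketch
below is only a model, not kernel code. The PAGE_OFFSET value, the kernel
load address, and the fixmap virtual address are assumptions chosen for
illustration, and linear_pa() and region_end() are hypothetical helper names
that stand in for a linear-map __pa() and for the end address memblock prints.

```c
#include <stdint.h>

/*
 * Assumed values for illustration only (none of them are stated in the
 * thread): an Sv39-style PAGE_OFFSET, a kernel loaded at the start of the
 * RAM range shown in the memblock dump, and a hypothetical fixmap VA for
 * the DTB.
 */
#define ASSUMED_PAGE_OFFSET	0xffffffe000000000ULL
#define ASSUMED_LOAD_PA		0x0000000080200000ULL
#define ASSUMED_FIXMAP_VA	0xffffffcffff00000ULL

/* Model of a linear-map __pa(): subtract a constant VA-to-PA offset. */
static uint64_t linear_pa(uint64_t va)
{
	uint64_t va_pa_offset = ASSUMED_PAGE_OFFSET - ASSUMED_LOAD_PA;

	/* Wraps modulo 2^64 when va lies below the linear mapping. */
	return va - va_pa_offset;
}

/* Last byte of a region that starts at pa and spans size bytes. */
static uint64_t region_end(uint64_t pa, uint64_t size)
{
	return pa + size - 1;
}
```

With these assumed inputs, linear_pa(ASSUMED_FIXMAP_VA) evaluates to
0xfffffff080100000, which is the start of reserved[0x1] in the first dump,
whereas linear_pa(ASSUMED_PAGE_OFFSET) gives back the load address as
expected for a real linear-map pointer. Reserving from the saved physical
address instead, region_end(0x80e00000, 0x528) yields 0x80e00527 as in the
second dump.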

--
Albert Ou