Re: [PATCH] efi/arm: fix allocation failure when reserving the kernel base

From: Mike Rapoport
Date: Wed Aug 21 2019 - 04:29:41 EST


On Wed, Aug 21, 2019 at 10:29:37AM +0300, Ard Biesheuvel wrote:
> On Wed, 21 Aug 2019 at 10:11, Mike Rapoport <rppt@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, Aug 21, 2019 at 09:35:16AM +0300, Ard Biesheuvel wrote:
> > > On Wed, 21 Aug 2019 at 09:11, Chester Lin <clin@xxxxxxxx> wrote:
> > > >
> > > > On Tue, Aug 20, 2019 at 03:28:25PM +0300, Ard Biesheuvel wrote:
> > > > > On Tue, 20 Aug 2019 at 14:56, Russell King - ARM Linux admin
> > > > > <linux@xxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Fri, Aug 02, 2019 at 05:38:54AM +0000, Chester Lin wrote:
> > > > > > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > > > > > index f3ce34113f89..909b11ba48d8 100644
> > > > > > > --- a/arch/arm/mm/mmu.c
> > > > > > > +++ b/arch/arm/mm/mmu.c
> > > > > > > @@ -1184,6 +1184,9 @@ void __init adjust_lowmem_bounds(void)
> > > > > > >  		phys_addr_t block_start = reg->base;
> > > > > > >  		phys_addr_t block_end = reg->base + reg->size;
> > > > > > >
> > > > > > > +		if (memblock_is_nomap(reg))
> > > > > > > +			continue;
> > > > > > > +
> > > > > > >  		if (reg->base < vmalloc_limit) {
> > > > > > >  			if (block_end > lowmem_limit)
> > > > > > >  				/*
> > > > > >
> > > > > > I think this hunk is sane - if the memory is marked nomap, then it isn't
> > > > > > available for the kernel's use, so as far as calculating where the
> > > > > > lowmem/highmem boundary is, it effectively doesn't exist and should be
> > > > > > skipped.
> > > > > >
> > > > >
> > > > > I agree.
> > > > >
> > > > > Chester, could you explain what you need beyond this change (and my
> > > > > EFI stub change involving TEXT_OFFSET) to make things work on the
> > > > > RPi2?
> > > > >
> > > >
> > > > Hi Ard,
> > > >
> > > > In fact I am working with Guillaume to try booting a zImage kernel and openSUSE
> > > > from grub2.04 + arm32-efistub, which is why we hit this issue on RPi2, one of
> > > > the test machines we have. However, we want a better solution for all cases,
> > > > not just RPi2, since we don't want to affect other platforms either.
> > > >
> > >
> > > Thanks Chester, but that doesn't answer my question.
> > >
> > > Your fix is a single patch that changes various things that are only
> > > vaguely related. We have already identified that we need to take
> > > TEXT_OFFSET (minus some space used by the swapper page tables) into
> > > account in the EFI stub if we want to ensure compatibility with many
> > > different platforms, and as it turns out, this applies not only to
> > > RPi2 but to other platforms as well, most notably the ones that
> > > require a TEXT_OFFSET of 0x208000, since they also have reserved
> > > regions at the base of RAM.
> > >
> > > My question was what else we need beyond:
> > > - the EFI stub TEXT_OFFSET fix [0]
> > > - the change to disregard NOMAP memblocks in adjust_lowmem_bounds()
> > > - what else???
> >
> > I think the only missing part here is to ensure that non-reserved memory in
> > bank 0 starts from a PMD-aligned address. I believe this could be done in the
> > EFI stub, but I'm not really familiar with it so this is just a semi-educated
> > guess :)
> >
>
> Given that it is the ARM arch code that imposes this requirement, how
> about adding something like this to adjust_lowmem_bounds():
>
>     if (memblock_start_of_DRAM() % PMD_SIZE)
>         memblock_mark_nomap(memblock_start_of_DRAM(),
>             PMD_SIZE - (memblock_start_of_DRAM() % PMD_SIZE));

memblock_start_of_DRAM() won't work here, as it returns the actual start of
DRAM, including NOMAP regions. Moreover, since we cannot mark a region NOMAP
inside the main for_each_memblock() loop, this should be done beforehand.
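
To illustrate the first point, here is a minimal user-space sketch. It assumes
that memblock_start_of_DRAM() simply returns the base of the first memory
region without looking at its flags. struct demo_region and the demo_*()
helpers are hypothetical stand-ins made up for this example; they are not the
kernel's memblock API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memblock region; not the kernel's struct. */
struct demo_region {
	uint64_t base;
	uint64_t size;
	bool nomap;
};

/* Models memblock_start_of_DRAM(): base of the first region, flags ignored. */
static uint64_t demo_start_of_dram(const struct demo_region *regs, size_t n)
{
	assert(n > 0);
	return regs[0].base;
}

/* Base of the first region that is not NOMAP, or UINT64_MAX if there is none. */
static uint64_t demo_first_usable_base(const struct demo_region *regs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!regs[i].nomap)
			return regs[i].base;
	return UINT64_MAX;
}
```

With a made-up layout where a NOMAP region sits at the base of RAM, the two
helpers return different addresses, and it is the second one whose alignment
matters for the lowmem mapping.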

I think something like this could work:

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 2f0f07e..f2b635b 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1178,6 +1178,19 @@ void __init adjust_lowmem_bounds(void)
 	 */
 	vmalloc_limit = (u64)(uintptr_t)vmalloc_min - PAGE_OFFSET + PHYS_OFFSET;

+	/*
+	 * The first usable region must be PMD aligned. Mark its start
+	 * as MEMBLOCK_NOMAP if it isn't
+	 */
+	for_each_memblock(memory, reg) {
+		if (!memblock_is_nomap(reg) && (reg->base % PMD_SIZE)) {
+			phys_addr_t size = PMD_SIZE - (reg->base % PMD_SIZE);
+
+			memblock_mark_nomap(reg->base, size);
+			break;
+		}
+	}
+
 	for_each_memblock(memory, reg) {
 		phys_addr_t block_start = reg->base;
 		phys_addr_t block_end = reg->base + reg->size;
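
The size arithmetic can be checked in user space with a small model. This is
only a sketch: struct toy_region, TOY_PMD_SIZE (assumed here to be 2 MiB) and
toy_first_usable_padding() are hypothetical names for illustration, not kernel
code. One deliberate difference: the sketch decides at the first usable region
and stops there even when that region is already aligned, whereas the loop
above keeps scanning until it finds an unaligned usable region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: one PMD entry covers 2 MiB. */
#define TOY_PMD_SIZE 0x200000ULL

/* Hypothetical stand-in for a memblock region; not the kernel's struct. */
struct toy_region {
	uint64_t base;
	uint64_t size;
	bool nomap;
};

/*
 * Return how many bytes at the start of the first usable (non-NOMAP)
 * region would have to be marked NOMAP so that the remaining usable
 * memory starts on a PMD boundary. Returns 0 if no trimming is needed
 * or if there is no usable region at all.
 */
static uint64_t toy_first_usable_padding(const struct toy_region *regs, size_t n)
{
	assert(regs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t misalign;

		if (regs[i].nomap)
			continue;

		/* First usable region found: decide here and stop looking. */
		misalign = regs[i].base % TOY_PMD_SIZE;
		return misalign ? TOY_PMD_SIZE - misalign : 0;
	}
	return 0;
}
```

For a made-up layout whose first usable region starts at 0x8000, the helper
returns 0x200000 - 0x8000 = 0x1f8000, which is the size the hunk above would
pass to memblock_mark_nomap() for that region.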




> (and introduce the nomap check into the loop)

--
Sincerely yours,
Mike.