Re: [PATCH 2/3] arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page

From: Peter Zijlstra
Date: Tue Jun 23 2020 - 05:37:42 EST


On Tue, Jun 23, 2020 at 10:07:58AM +0100, Will Deacon wrote:
> On Tue, Jun 23, 2020 at 11:05:05AM +0200, Christoph Hellwig wrote:
> > On Sat, Jun 20, 2020 at 07:16:16PM -0700, Andrew Morton wrote:
> > > On Thu, 18 Jun 2020 08:43:06 +0200 Christoph Hellwig <hch@xxxxxx> wrote:
> > >
> > > > Use PAGE_KERNEL_ROX directly instead of allocating RWX and setting the
> > > > page read-only just after the allocation.
> > > >
> > > > --- a/arch/arm64/kernel/probes/kprobes.c
> > > > +++ b/arch/arm64/kernel/probes/kprobes.c
> > > > @@ -120,15 +120,9 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
> > > >
> > > > void *alloc_insn_page(void)
> > > > {
> > > > -	void *page;
> > > > -
> > > > -	page = vmalloc_exec(PAGE_SIZE);
> > > > -	if (page) {
> > > > -		set_memory_ro((unsigned long)page, 1);
> > > > -		set_vm_flush_reset_perms(page);
> > > > -	}
> > > > -
> > > > -	return page;
> > > > +	return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
> > > > +			GFP_KERNEL, PAGE_KERNEL_ROX, VM_FLUSH_RESET_PERMS,
> > > > +			NUMA_NO_NODE, __func__);
> > > > }
> > > >
> > > > /* arm kprobe: install breakpoint in text */
> > >
> > > But why. I think this is just a cleanup, doesn't address any runtime issue?
> >
> > It doesn't "fix" an issue - it just simplifies and speeds up the code.
>
> Ok, but I don't understand the PLT comment from Peter in
> 20200618092754.GF576905@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:
>
> | I think this has the exact same range issue as the x86 user. But it
> | might be less fatal if their PLT magic can cover the full range.
>
> Peter, please could you elaborate on your concern? I feel like I'm missing
> some context.

On x86 we can only directly call code within a signed 32-bit immediate
range (+-2G), and our kernel text and module ranges are constrained by that.

IIRC ARM64 has an even smaller immediate range and needs to play fixup
games with trampolines or some such (there was an ARM-specific name for
it that I've misplaced again). Does that machinery cover the entire
vmalloc space, or are you only able to fix up for a smaller range?

Your arch/arm64/kernel/module.c:module_alloc() implementation seems to
have an explicit module range different from the full vmalloc range; I'm
thinking that is for a reason.
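
To make the range concern concrete, here is a rough userspace sketch of
the reachability check I have in mind. The helper name and the macros
are made up for illustration and are not kernel API. It assumes the
x86 direct call takes a signed 32-bit displacement (+-2G) and the ARM64
B/BL takes a signed 26-bit word offset (+-128M), and it ignores the
detail of which address the displacement is measured from.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reach of a direct branch with a signed PC-relative offset, in bytes. */
#define X86_REL32_RANGE		(INT64_C(1) << 31)	/* call/jmp rel32: +-2G */
#define ARM64_IMM26_RANGE	(INT64_C(1) << 27)	/* b/bl imm26 << 2: +-128M */

/*
 * Hypothetical helper, not a kernel function: returns true if a direct
 * branch located at 'from' can encode a jump to 'to' for the given reach.
 */
static bool branch_in_range(uint64_t from, uint64_t to, int64_t range)
{
	int64_t off = (int64_t)(to - from);

	return off >= -range && off < range;
}
```

So for a page that lands 1G away from the text, the check passes with
X86_REL32_RANGE but fails with ARM64_IMM26_RANGE, and anything
allocated from the full vmalloc range can end up that far away or
further.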