Re: [PATCH v5 bpf-next 1/5] module: introduce module_alloc_huge

From: Luis Chamberlain
Date: Thu Jul 07 2022 - 16:11:48 EST


On Wed, Jul 06, 2022 at 04:39:13AM +0000, Song Liu wrote:
> > On Jul 1, 2022, at 4:20 PM, Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
> > On Fri, Jun 24, 2022 at 02:57:08PM -0700, Song Liu wrote:
> >> +void *module_alloc_huge(unsigned long size)
> >> +{
> >> + gfp_t gfp_mask = GFP_KERNEL;
> >> + void *p;
> >> +
> >> + if (PAGE_ALIGN(size) > MODULES_LEN)
> >> + return NULL;
> >> +
> >> + p = __vmalloc_node_range(size, MODULE_ALIGN,
> >> + MODULES_VADDR + get_module_load_offset(),
> >> + MODULES_END, gfp_mask, PAGE_KERNEL,
> >> + VM_DEFER_KMEMLEAK | VM_ALLOW_HUGE_VMAP,
> >> + NUMA_NO_NODE, __builtin_return_address(0));
> >> + if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
> >> + vfree(p);
> >> + return NULL;
> >> + }
> >> +
> >> + return p;
> >> +}
> >
> > 1) When things like kernel/bpf/core.c start using a module alloc it
> > is time to consider genearlizing this.
>
> I am not quite sure what the generalization would look like. IMHO, the
> ideal case would have:
> a) A kernel_text_rw_allocator, similar to current module_alloc.
> b) A kernel_text_ro_allocator, similar to current bpf_prog_pack_alloc.
> This is built on top of kernel_text_rw_allocator. Different
> allocations could share a page, thus it requires text_poke like
> support from the arch.
> c) If the arch supports text_poke, kprobes, ftrace trampolines, and
> bpf trampolines should use kernel_text_ro_allocator.
> d) Major archs should support CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC,
> and they should use kernel_text_ro_allocator for module text.
>
> Does this sound reasonable to you?

Yes, and a respective free call may have an arch specific stuff which
removes exec stuff.

In so far as the bikeshedding, I do think this is generic so
vmalloc_text_*() suffices or vmalloc_exec_*() take your pick for
a starter and let the world throw in their preference.

> I tried to enable CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC for x86_64,
> but that doesn't really work. Do we have plan to make this combination
> work?

Oh nice.

Good stuff. Perhaps it just requires a little love from mm folks.
Don't beat yourself up if it does not yet. We can work towards that
later.

> > 2) How we free is important, and each arch does something funky for
> > this. This is not addressed here.
>
> How should we address this? IIUC, x86_64 just calls vfree.

That's not the case for all archs is it? I'm talking about the generic
module_alloc() too. I'd like to see that go the way we discussed above.

> > And yes I welcome generalizing generic module_alloc() too as suggested
> > before. The concern on my part is the sloppiness this enables.
>
> One question I have is, does module_alloc (or kernel_text_*_allocator
> above) belong to module code, or mm code (maybe vmalloc)?

The evolution of its uses indicates it is growing outside of modules and
so mm should be the new home.

> I am planning to let BPF trampoline use bpf_prog_pack on x86_64, which
> is another baby step of c) above.

OK!

Luis