Re: [PATCH v3 bpf-next 5/8] bpf: use module_alloc_huge for bpf_prog_pack

From: Edgecombe, Rick P
Date: Fri May 20 2022 - 23:20:37 EST


On Fri, 2022-05-20 at 18:00 -0700, Luis Chamberlain wrote:
> Although VM_FLUSH_RESET_PERMS is rather new, my concern here is that
> we're essentially enabling sloppy users to grow without also
> addressing what happens if we have to take the leash back to support
> VM_FLUSH_RESET_PERMS properly. If the hack to support this on
> architectures other than x86 is as simple as the one you have in
> vm_remove_mappings() today:
>
>     if (flush_reset && !IS_ENABLED(CONFIG_ARCH_HAS_SET_DIRECT_MAP)) {
>             set_memory_nx(addr, area->nr_pages);
>             set_memory_rw(addr, area->nr_pages);
>     }
>
> then I suppose this isn't a big deal. I'm just concerned about this
> being a slippery slope of sloppiness leading to something which we
> will regret later.
>
> My intuition tells me this shouldn't be a big issue, but I just want
> to confirm.

Yea, I raised the same concern on the last thread:

https://lore.kernel.org/lkml/83a69976cb93e69c5ad7a9511b5e57c402eee19d.camel@xxxxxxxxx/

Song said he plans to make kprobes and ftrace work with this new
allocator. If that happens, VM_FLUSH_RESET_PERMS would have only one
user: modules. Care to chime in with your plans for modules? If there
are actual near-term plans to keep working on this,
VM_FLUSH_RESET_PERMS might be changed again or turn into something
else. If we are about to rethink everything, then it doesn't matter
as much to fix what would then be old.

Besides not fixing VM_FLUSH_RESET_PERMS/hibernate though, I think this
allocator still feels a little rough. For example, I don't think we
actually know how much the huge mappings are helping. It is also
allocating memory in a big chunk from a single node and reusing it,
where before we were allocating based on NUMA node for each JIT. Would
some users suffer from that? Maybe it's obvious to others, but I would
have expected to see more discussion of MM things like that.

But I like the general direction of caching and using text_poke() to
write the JITs a lot. However it works, it seems to make a big impact
in at least some workloads.

So yea, seems sloppy, but probably (...I guess?) more good for users
than sloppy for us.