Re: [v3 PATCH 0/6] arm64: support FEAT_BBM level 2 and large block mapping when rodata=full

From: Yang Shi
Date: Mon Jun 02 2025 - 16:55:20 EST

On 6/2/25 3:47 AM, Ryan Roberts wrote:
> On 30/05/2025 18:18, Yang Shi wrote:
>> On 5/30/25 12:59 AM, Ryan Roberts wrote:
>>> On 29/05/2025 21:52, Yang Shi wrote:
>>>> split_mapping() guarantees that a block mapping is kept if it is fully
>>>> contained in the range between start and end; that is my series's
>>>> responsibility. I know the current code calls apply_to_page_range() to
>>>> apply the permission change and it just does it on a per-PTE basis. So
>>>> IIUC Dev's series will modify it or provide a new API, and then
>>>> __change_memory_common() will call that to change the permissions.
>>>> There should be some overlap between mine and Dev's, but I don't see a
>>>> strong dependency.
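To make that contract concrete, here is a rough sketch in the context of
arch/arm64/mm/pageattr.c; split_mapping() is just the name used in this
discussion, and its exact signature is assumed rather than taken from the
patches:

/*
 * Sketch only: split_mapping() is assumed to split just the block
 * mappings that straddle 'start' or 'end'; a PMD or contiguous mapping
 * fully contained in [start, end) is kept as a block.
 */
static int change_perms_after_split(unsigned long start, unsigned long end,
				    pgprot_t set_mask, pgprot_t clear_mask)
{
	int ret;

	/* Split only at the boundaries of the requested range. */
	ret = split_mapping(start, end);
	if (ret)
		return ret;

	/*
	 * Apply the permission change. Today this path ends up in
	 * apply_to_page_range(), which only walks PTEs; handling the block
	 * mappings that survive the split is the part that overlaps with
	 * Dev's series.
	 */
	return __change_memory_common(start, end - start, set_mask, clear_mask);
}

For example, with a 2M-block-mapped linear map and start = 2M + 4K,
end = 8M, only the block covering [2M, 4M) needs to be split to PTEs; the
blocks at [4M, 6M) and [6M, 8M) stay intact.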
>>> But if you have a block mapping in the region you are calling
>>> __change_memory_common() on, today that will fail because it can only
>>> handle page mappings.
>>>> IMHO, letting __change_memory_common() manipulate a contiguous address
>>>> range is another story and should not be part of the split primitive.
>>> I 100% agree that it should not be part of the split primitive.
>>>
>>> But your series *depends* upon __change_memory_common() being able to
>>> change permissions on block mappings. Today it can only change
>>> permissions on page mappings.
>> I don't think the split primitive depends on it. Changing permissions on
>> block mappings is just a user of the new split primitive IMHO. We just
>> have no real user right now.
> But your series introduces a real user; after your series, the linear map
> is block mapped.
>>>> The users of the split primitive are the permission changers, for
>>>> example, modules, BPF, secretmem, etc.
>>> Ahh, perhaps this is the crux of our misunderstanding... In my model,
>>> the split primitive is called from __change_memory_common() (or from
>>> other appropriate functions in pageattr.c). It's an implementation
>>> detail for arm64 and is not exposed to common code. arm64 knows that it
>>> can split live mappings in a transparent way, so it uses huge pages
>>> eagerly and splits on demand.
>>>
>>> I personally wouldn't want to be relying on the memory user knowing it
>>> needs to split the mappings...
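A minimal sketch of that model, assuming the primitive keeps the
split_mapping() name from this discussion and is hooked into arm64's
existing __change_memory_common() in arch/arm64/mm/pageattr.c; this is the
shape being described, not the actual patch:

static int __change_memory_common(unsigned long start, unsigned long size,
				  pgprot_t set_mask, pgprot_t clear_mask)
{
	struct page_change_data data = {
		.set_mask = set_mask,
		.clear_mask = clear_mask,
	};
	int ret;

	/*
	 * Split transparently and on demand, purely as an arm64-internal
	 * detail: callers never need to know whether the range is block
	 * mapped.
	 */
	ret = split_mapping(start, start + size);
	if (ret)
		return ret;

	/*
	 * Today this step walks PTEs only; teaching it to also handle the
	 * block mappings that survive the split is what Dev's series is
	 * about.
	 */
	ret = apply_to_page_range(&init_mm, start, size, change_page_range,
				  &data);
	flush_tlb_kernel_range(start, start + size);
	return ret;
}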
>> We are actually on the same page...
>>
>> For example, when loading a module, the kernel currently does:
>>
>>   vmalloc()                // allocate memory for the module
>>   module_enable_text_rox() // change permissions to ROX for the text section
>>     set_memory_x()
>>       change_memory_common()
>>         for every page in the vmalloc area:
>>           __change_memory_common(addr, PAGE_SIZE, ...) // page by page
>>             split_mapping(addr, addr + PAGE_SIZE)
>>             apply_to_page_range() // apply the new permission
>>
>> __change_memory_common() has to be called on a per-page basis because we
>> don't know whether the pages backing the vmalloc area are physically
>> contiguous. So the split primitive is called on a per-page basis as well.
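A simplified sketch of that per-page walk, loosely based on how arm64's
change_memory_common() already iterates the backing pages for the
rodata=full case; the helper name is made up for illustration and the
split step is only implied by the call into __change_memory_common():

/*
 * Sketch only: the pages backing a vmalloc area need not be physically
 * contiguous, so their linear map aliases are changed one page at a time.
 */
static void change_vmalloc_backing_pages(struct vm_struct *area,
					 pgprot_t set_mask, pgprot_t clear_mask)
{
	unsigned int i;

	for (i = 0; i < area->nr_pages; i++) {
		unsigned long addr = (unsigned long)page_address(area->pages[i]);

		/* Each call covers (and may split around) one PAGE_SIZE range. */
		__change_memory_common(addr, PAGE_SIZE, set_mask, clear_mask);
	}
}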
> Yes that makes sense for the case where we are setting permissions on a
> virtually contiguous region of vmalloc space; in that case we must set
> permissions on the linear map page-by-page. Agreed.
>
> I was thinking of the cases where we are changing the permissions on a
> virtually contiguous region of the *linear map*. Although looking again
> at the code, it seems there aren't as many places as I thought that
> actually do this. I think set_direct_map_valid_noflush() is the only one
> that will operate on multiple pages of the linear map at a time. But this
> single case means that you could end up wanting to change permissions on
> a large block mapping and therefore need Dev's work, right?

Yes, set_direct_map_valid_noflush() may be called on multiple pages. But
that interface was introduced by Mike Rapoport's "x86/module: use large
ROX pages for text allocations" series, so only x86 supports it right now.
The large execmem ROX cache is not supported on arm64 yet IIRC.
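IIRC the interface that series added takes a page count, roughly as in the
usage below; treat the exact signature and the helper name as illustrative
recollection rather than authoritative:

/*
 * Sketch: restore 'nr' linear map pages in one call. On a block-mapped
 * linear map a single call like this can span a whole PMD block, which is
 * exactly where being able to change permissions on block mappings
 * (Dev's work) would matter.
 */
static int restore_direct_map(struct page *page, unsigned int nr)
{
	/* valid = true: make the linear map aliases present again. */
	return set_direct_map_valid_noflush(page, nr, true);
}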

Thanks,
Yang



> Thanks,
> Ryan


>> So we need to do the below in order to keep the large mapping:
>>
>>   check whether the vmalloc area is huge mapped (PMD / CONT PMD / CONT PTE)
>>   if (it is huge mapped)
>>       __change_memory_common(addr, HUGE_SIZE, ...)
>>           split_mapping(addr, addr + HUGE_SIZE)
>>           change permission on (addr, addr + HUGE_SIZE)
>>   else
>>       fall back to page basis
>>
>> To have a huge mapping for the vmalloc area in the first place, we need
>> to use vmalloc_huge() or the new implementation proposed by you to
>> allocate the module memory. This is the "user" in my understanding.
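A compact sketch of that fallback logic; the helper and the huge_mapped
flag are purely illustrative (how to detect that the area is huge mapped
is exactly the open question here), and PMD_SIZE stands in for HUGE_SIZE:

/*
 * Sketch only: batch the permission change per huge chunk when the
 * vmalloc area is known to be huge mapped, otherwise fall back to page
 * granularity.
 */
static void change_perms_batched(unsigned long addr, unsigned long size,
				 bool huge_mapped,
				 pgprot_t set_mask, pgprot_t clear_mask)
{
	unsigned long end = addr + size;
	unsigned long step = huge_mapped ? PMD_SIZE : PAGE_SIZE;

	for (; addr < end; addr += step) {
		/*
		 * split_mapping() only splits at the boundaries, so a PMD
		 * block fully inside [addr, addr + step) survives when the
		 * step is PMD_SIZE.
		 */
		__change_memory_common(addr, min(step, end - addr),
				       set_mask, clear_mask);
	}
}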

>> Thanks,
>> Yang