Re: [PATCH v1] selftests/mm: add simple VM_PFNMAP tests based on mmap'ing /dev/mem

From: David Hildenbrand
Date: Fri May 09 2025 - 06:19:23 EST


On 09.05.25 11:49, Lorenzo Stoakes wrote:
On Fri, May 09, 2025 at 12:20:41AM +0200, David Hildenbrand wrote:
Let's test some basic functionality using /dev/mem. These tests will
implicitly cover some PAT (Page Attribute Handling) handling on x86.

Ah this is really nice thanks for this!

Thanks for your review!


ok 14 mprotect(PROT_NONE)
ok 15 SIGSEGV expected
ok 16 mprotect(PROT_READ)
ok 17 SIGSEGV not expected
ok 18 fork()
ok 19 SIGSEGV in child not expected
# Totals: pass:19 fail:0 xfail:0 xpass:0 skip:0 error:0

It'd be good to assert that merging doesn't work for VM_PFNMAP, though hm
one could argue that's not hugely useful as it's trivially implemented.

But I guess anything like that should live in merge.c.

I assume we'd need is_range_mapped() from mremap_tests.c.

Something for another day :)

[...]

+static void signal_handler(int sig)
+{
+ if (sig == SIGSEGV)
+ siglongjmp(env, 1);
+ siglongjmp(env, 2);
+}

Hm, wouldn't it be better to only catch these only if you specifically
meant to catch a signal?

I had that, but got tired about the repeated register + unregister, after all I really don't want to spend a lot more time on this.

You can see what I did in guard-regions.c for an example (sorry, I'm sure
you know exactly how the thing works, just I mean for an easy reminder :P)


Again, time is the limit. But let me see if I can get something done in a reasonable timeframe.

+
+static void sense_support(void)
+{

See below comment about the kselftest_harness, but with that you can
literally declare fixture setups/teardowns very nicely :) You can also
mmap() these 2 pages and munmap() them afterwards straightforwardly.

+ char *addr, tmp;
+ int ret;
+
+ dev_mem_fd = open("/dev/mem", O_RDONLY);
+ if (dev_mem_fd < 0)
+ ksft_exit_skip("Cannot open '/dev/mem': %s\n", strerror(errno));

Hm skip, or failure? Skip implies it's expected right? I suppose it's
possible a system might be setup without this...

Try as non-root or on a lockdowned system :)


+
+ /* We'll require the first two pages throughout our tests ... */
+ addr = mmap(0, pagesize * 2, PROT_READ, MAP_SHARED, dev_mem_fd, 0);
+ if (addr == MAP_FAILED)
+ ksft_exit_skip("Cannot mmap '/dev/mem'");
+
+ /* ... and want to be able to read from them. */
+ ret = sigsetjmp(env, 1);
+ if (!ret) {
+ tmp = *addr + *(addr + pagesize);
+ asm volatile("" : "+r" (tmp));

Is this not pretty much equivalent to a volatile read where you're forcing
the compiler to not optimise this unused thing away? In guard-regions I set:

#define FORCE_READ(x) (*(volatile typeof(x) *)x)

For this purpose, which would make this:

FORCE_READ(addr);
FORCE_READ(&addr[pagesize]);

Hmmm, a compiler might be allowed to optimize out a volatile read.


+ }
+ if (ret)
+ ksft_exit_skip("Cannot read-access mmap'ed '/dev/mem'");

Why are we returning 1 or 2 if we don't differentiate it here?

Copy-and-paste. As we are not registering for SIGBUS, we can just return 1.


+
+ munmap(addr, pagesize * 2);
+}
+
+static void test_madvise(void)
+{
+#define INIT_ADVICE(nr) { nr, #nr}
+ const struct {
+ int nr;
+ const char *name;
+ } advices[] = {
+ INIT_ADVICE(MADV_DONTNEED),
+ INIT_ADVICE(MADV_DONTNEED_LOCKED),
+ INIT_ADVICE(MADV_FREE),
+ INIT_ADVICE(MADV_WIPEONFORK),
+ INIT_ADVICE(MADV_COLD),
+ INIT_ADVICE(MADV_PAGEOUT),
+ INIT_ADVICE(MADV_POPULATE_READ),
+ INIT_ADVICE(MADV_POPULATE_WRITE),
+ };
+ char *addr;
+ int ret, i;
+
+ addr = mmap(0, pagesize, PROT_READ, MAP_SHARED, dev_mem_fd, 0);

Nit (same for all mmap() calls) shouldn't this first parameter be NULL, by
convention? I mean not a big deal obviously :)

Yes.


+ if (addr == MAP_FAILED)
+ ksft_exit_fail_msg("mmap() failed: %s\n", strerror(errno));
+
+ /* All these advices must be rejected. */
+ for (i = 0; i < ARRAY_SIZE(advices); i++) {
+ ret = madvise(addr, pagesize, advices[i].nr);
+ ksft_test_result(ret && errno == EINVAL,
+ "madvise(%s) should be disallowed\n",
+ advices[i].name);
+ }
+
+ munmap(addr, pagesize);
+}
+
+static void test_munmap_splitting(void)
+{
+ char *addr1, *addr2;
+ int ret;
+
+ addr1 = mmap(0, pagesize * 2, PROT_READ, MAP_SHARED, dev_mem_fd, 0);
+ if (addr1 == MAP_FAILED)
+ ksft_exit_fail_msg("mmap() failed: %s\n", strerror(errno));
+
+ /* Unmap the first pages. */

NIT: pages -> page.

Ack.


+ ret = munmap(addr1, pagesize);
+ ksft_test_result(!ret, "munmap() splitting\n");
+
+ /* Remap the first page while the second page is still mapped. */
+ addr2 = mmap(0, pagesize, PROT_READ, MAP_SHARED, dev_mem_fd, 0);
+ ksft_test_result(addr2 != MAP_FAILED, "mmap() after splitting\n");

Hm not sure what the assertion is here per se, that we can munmap() partial
bits of the VMA? It'd be pretty weird if we couldn't though?
> > If it's that we don't get a merge when we remap, we're not really checking
that, but you actually can, as I added an API to vm_util for this using
PROCMAP_QUERY (very handy tool actually - binary version of /proc/smaps).

I don't care about merging tests (I'll leave that to you :P ).

This is a PAT test for upcoming changes where partial unmap can leave the original region reserved. Making sure that re-mapping with the pending reservation still works.

+
+ if (addr2 != MAP_FAILED)
+ munmap(addr2, pagesize);
+ if (!ret)
+ munmap(addr1 + pagesize, pagesize);
+ else
+ munmap(addr1, pagesize * 2);

There's no need for this dance, you can just munmap() away, it tolerates
gaps and multiple VMAs.

Yeah, I know. I was not sure if the ksft_test_result() in between might allocate memory and consume that area.


+}
+
+static void test_mremap_fixed(void)
+{
+ char *addr, *new_addr, *ret;
+
+ addr = mmap(0, pagesize * 2, PROT_READ, MAP_SHARED, dev_mem_fd, 0);
+ if (addr == MAP_FAILED)
+ ksft_exit_fail_msg("mmap() failed: %s\n", strerror(errno));
+
+ /* Reserve a destination area. */
+ new_addr = mmap(0, pagesize * 2, PROT_READ, MAP_ANON | MAP_PRIVATE, -1, 0);
+ if (new_addr == MAP_FAILED)
+ ksft_exit_fail_msg("mmap() failed: %s\n", strerror(errno));
+
+ /* mremap() over our destination. */
+ ret = mremap(addr, pagesize * 2, pagesize * 2,
+ MREMAP_FIXED | MREMAP_MAYMOVE, new_addr);
+ ksft_test_result(ret == new_addr, "mremap(MREMAP_FIXED)\n");
+ if (ret != new_addr)
+ munmap(new_addr, pagesize * 2);

This could only be an error code, and this will fail right?

MREMAP_FIXED is 'do or die' at the new address, not hinting. If there's
anything already mapped there it goes a bye bye.

So again, we could just have a standard munmap(), and this lends itself
well to a FIXTURE_SETUP()/FIXTURE_TEARDOWN() :P

I'm afraid I cannot spend much more time on these tests :P But let me try for a couple of minutes.


+ munmap(addr, pagesize * 2);
+}
+
+static void test_mremap_shrinking(void)
+{
+ char *addr, *ret;
+
+ addr = mmap(0, pagesize * 2, PROT_READ, MAP_SHARED, dev_mem_fd, 0);
+ if (addr == MAP_FAILED)
+ ksft_exit_fail_msg("mmap() failed: %s\n", strerror(errno));
+
+ /* Shrinking is expected to work. */
+ ret = mremap(addr, pagesize * 2, pagesize, 0);
+ ksft_test_result(ret == addr, "mremap() shrinking\n");
+ if (ret != addr)
+ munmap(addr, pagesize * 2);
+ else
+ munmap(addr, pagesize);

I think we're safe to just munmap() as usual here :) (it's nitty but I'm
trying to make the case for teardown again of course :P)

Same reasoning as above regarding ksft_test_result().


+}
+
+static void test_mremap_growing(void)
+{
+ char *addr, *ret;
+
+ addr = mmap(0, pagesize, PROT_READ, MAP_SHARED, dev_mem_fd, 0);
+ if (addr == MAP_FAILED)
+ ksft_exit_fail_msg("mmap() failed: %s\n", strerror(errno));
+
+ /* Growing is not expected to work. */

God imagine if we did allow it... what hell would it be to figure out how
to do this correctly in all cases :P

:)


+ ret = mremap(addr, pagesize, pagesize * 2, MREMAP_MAYMOVE);
+ ksft_test_result(ret == MAP_FAILED,
+ "mremap() growing should be disallowed\n");
+ if (ret == MAP_FAILED)
+ munmap(addr, pagesize);
+ else
+ munmap(ret, pagesize * 2);

This is a bit cautious, for a world where we do lose our minds and allow
this? :)

Yeah, went back and forth with this error cleanup shit.


+}
+
+static void test_mprotect(void)
+{
+ char *addr, tmp;
+ int ret;
+
+ addr = mmap(0, pagesize, PROT_READ, MAP_SHARED, dev_mem_fd, 0);
+ if (addr == MAP_FAILED)
+ ksft_exit_fail_msg("mmap() failed: %s\n", strerror(errno));
+
+ /* With PROT_NONE, read access must result in SIGSEGV. */
+ ret = mprotect(addr, pagesize, PROT_NONE);
+ ksft_test_result(!ret, "mprotect(PROT_NONE)\n");
+
+ ret = sigsetjmp(env, 1);
+ if (!ret) {
+ tmp = *addr;
+ asm volatile("" : "+r" (tmp));
+ }

This code is duplicated, we definitely want to abstract it.

Probably yes.


+ ksft_test_result(ret == 1, "SIGSEGV expected\n");

Hmm, what exactly are we testing here though? I mean PROT_NONE will be a
failed access for _any_ kind of memory? Is this really worthwhile? Maybe
better to mprotect() as PROT_NONE to start then mprotect() to PROT_READ.
> > But I'm not sure what that really tests? Is it a PAT-specific thing? It
seems if this is broken then the mapping code is more generally broken
beyond just VM_PFNMAP mappings right?

Rationale was to test the !vm_normal_folio() code paths that are not covered by "ordinary" mprotect (except the shared zeropage). But there should indeed only be such a check on the prot_numa code path, so I can just drop this test.

[...]

+int main(int argc, char **argv)
+{
+ int err;
+
+ ksft_print_header();
+ ksft_set_plan(19);

I know it's kind of nitpicky, but I really hate this sort of magic number
and so on. You don't actually need any of this, the kselftest_harness.h is
_really_ powerful, and makes for much much more readable and standardised
test code.

You can look at guard-regions.c in the test code (though there's some
complexity there because I use 'variants') or the merge.c test code
(simpler) for straight-forward examples.

I won't block this change on this however, I don't want to be a pain and
you're adding very important tests here, but it'd be really nice if you did
use that :>)

Yeah, let me explore that real quick, thanks!

--
Cheers,

David / dhildenb