Re: [syzbot] [mm?] kernel BUG in sanity_check_pinned_pages

From: David Hildenbrand
Date: Mon Jun 23 2025 - 06:10:37 EST


On 23.06.25 11:53, Alexander Potapenko wrote:
> On Mon, Jun 23, 2025 at 11:29 AM 'David Hildenbrand' via
> syzkaller-bugs <syzkaller-bugs@xxxxxxxxxxxxxxxx> wrote:
>
>> On 21.06.25 23:52, syzbot wrote:
>>> syzbot has found a reproducer for the following issue on:
>>>
>>> HEAD commit: 9aa9b43d689e Merge branch 'for-next/core' into for-kernelci
>>> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1525330c580000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=27f179c74d5c35cd
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=1d335893772467199ab6
>>> compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
>>> userspace arch: arm64
>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16d73370580000
>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=160ef30c580000
>>
>> There is not that much magic in there, I'm afraid.
>>
>> fork() is only used to spin up guests, but before the memory region of
>> interest is actually allocated, IIUC. No threading code that races.
>>
>> IIUC, it triggers fairly fast on aarch64. I've left it running for a
>> while on x86_64 without any luck.
>>
>> So maybe this is really some aarch64-special stuff (pointer tagging?).
>>
>> In particular, there is something very weird in the reproducer:
>>
>> syscall(__NR_madvise, /*addr=*/0x20a93000ul, /*len=*/0x4000ul,
>> /*advice=MADV_HUGEPAGE|0x800000000*/ 0x80000000eul);
>>
>> advice is supposed to be a 32-bit int. What does the magical
>> "0x800000000" do?
>
> I am pretty sure this is a red herring.
> Syzkaller sometimes mutates integer flags, even if the result makes no
> sense - because sometimes it can trigger interesting bugs.
> This `advice` argument will be discarded by is_valid_madvise(),
> resulting in -EINVAL.

I thought the same, but likely the upper bits are discarded, and we end up with __NR_madvise succeeding.

The kernel config has

CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y

So without MADV_HUGEPAGE, we wouldn't get a THP in the first place.

So this is likely equivalent to just dropping the "0x800000000".
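
To illustrate the truncation I suspect is happening, here is a minimal userspace sketch. It assumes the 64-bit register value is simply narrowed to the 32-bit int advice parameter, which I have not verified against the arm64 syscall entry code. advice_as_seen_by_kernel() and SKETCH_MADV_HUGEPAGE are hypothetical names made up for this example; 14 is the MADV_HUGEPAGE value from the Linux UAPI headers.

```c
#include <stdint.h>

/* MADV_HUGEPAGE as defined in the Linux UAPI headers. */
#define SKETCH_MADV_HUGEPAGE 14

/*
 * Hypothetical helper: models a 64-bit syscall argument being narrowed
 * to a 32-bit int. Only the low 32 bits survive, so bit 35
 * (0x800000000) is dropped. For values above INT_MAX the conversion is
 * implementation-defined in C; gcc and clang wrap modulo 2^32.
 */
int advice_as_seen_by_kernel(uint64_t raw_arg)
{
    return (int)(uint32_t)raw_arg;
}
```

Under that assumption, advice_as_seen_by_kernel(0x80000000eul) evaluates to 0xe, which equals SKETCH_MADV_HUGEPAGE, so the call would behave like a plain MADV_HUGEPAGE request instead of failing with -EINVAL.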

Anyhow, I managed to reproduce it in a VM using the provided rootfs on aarch64. It triggers immediately, so no races are involved.

Running the reproducer on a Fedora 42 debug-kernel in the hypervisor does not trigger it.

--
Cheers,

David / dhildenb