Re: [PATCH v10 13/69] mm/mmap: use maple tree for unmapped_area{_topdown}

From: Alexander Gordeev
Date: Sat Jul 09 2022 - 03:31:18 EST


On Tue, Jun 21, 2022 at 08:46:55PM +0000, Liam Howlett wrote:
> From: "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>
>
> The maple tree code was added to find the unmapped area in a previous
> commit and was checked against what the rbtree returned, but the actual
> result was never used. Start using the maple tree implementation and
> remove the rbtree code.
>
> Add kernel documentation comment for these functions.

Hi Liam,

With this update a user process crash is triggered on s390 when
the code below is executed (derived from the LTP fork14 testcase):

#include <unistd.h>
#include <sys/mman.h>

#define GB (1024 * 1024 * 1024L)
#define EXTENT (16 * 1024 + 10)

int main(int argc, char **argv)
{
	void *addr;
	int i;

	for (i = 0; i < EXTENT; i++) {
		addr = mmap(NULL, 1 * GB, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
		if (addr == MAP_FAILED)
			break;
	}

	return 0;
}

On the 4095th iteration mmap() returns a normal address, but the shared
library mappings go away. The page tables seem to be intact, as the
memory is still accessible (I did not check every vanished mapping, though).
In addition, the memory contents of the vanished mappings are zeroed.
As a result, the instruction that follows the mmap() system call turns
into an invalid operation code:

t35lp64 login: [45116.631391] User process fault: interruption code 0004 ilc:1
[45116.631403] Failing address: 000003ffa580c000 TEID: 000003ffa580c884
[45116.631405] Fault in primary space mode while using user ASCE.
[45116.631407] AS:00000000e75fc1c7 R3:00000000e758c007 S:00000000a3e01701
[45116.631411] CPU: 4 PID: 1745 Comm: mmap Not tainted 5.19.0-rc4-00162-g34de4ebd5706 #36
[45116.631414] Hardware name: IBM 8561 T01 703 (LPAR)
[45116.631416] User PSW : 0705000180000000 000003ffa580cc38
[45116.631418] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:0 PM:0 RI:0 EA:3
[45116.631420] User GPRS: 0000000000000000 000003ffa5af4040 000003ff65afb000 0000000040000000
[45116.631422] 0000000000000003 0000000000000022 0000000000000000 0000000001003e00
[45116.631423] 000003ffa5ab0b48 000003ffa5ab1018 0000000000000001 000003fff5879500
[45116.631425] 000003ffa5ab0f70 0000000000000000 0000000001001218 000003fff5879428
[45116.631429] User Code: 000003ffa580cc32: 0000 illegal
[45116.631429] 000003ffa580cc34: 0000 illegal
[45116.631429] #000003ffa580cc36: 0000 illegal
[45116.631429] >000003ffa580cc38: 0000 illegal
[45116.631429] 000003ffa580cc3a: 0000 illegal
[45116.631429] 000003ffa580cc3c: 0000 illegal
[45116.631429] 000003ffa580cc3e: 0000 illegal
[45116.631429] 000003ffa580cc40: 0000 illegal
[45116.631437] Last Breaking-Event-Address:
[45116.631438] [<0000000000000001>] 0x1

In other words, if before the mmap() call memory mappings look like this:

Start Addr End Addr Size Offset Perms objfile
0x1000000 0x1001000 0x1000 0x0 r--p /root/main/mmap
0x1001000 0x1002000 0x1000 0x1000 r-xp /root/main/mmap
0x1002000 0x1003000 0x1000 0x2000 r--p /root/main/mmap
0x1003000 0x1004000 0x1000 0x2000 r--p /root/main/mmap
0x1004000 0x1005000 0x1000 0x3000 rw-p /root/main/mmap
0x3fff7c00000 0x3fff7c2b000 0x2b000 0x0 r--p /usr/lib64/libc.so.6
0x3fff7c2b000 0x3fff7d64000 0x139000 0x2b000 r-xp /usr/lib64/libc.so.6
0x3fff7d64000 0x3fff7dc3000 0x5f000 0x164000 r--p /usr/lib64/libc.so.6
0x3fff7dc3000 0x3fff7dc4000 0x1000 0x1c3000 ---p /usr/lib64/libc.so.6
0x3fff7dc4000 0x3fff7dc8000 0x4000 0x1c3000 r--p /usr/lib64/libc.so.6
0x3fff7dc8000 0x3fff7dca000 0x2000 0x1c7000 rw-p /usr/lib64/libc.so.6
0x3fff7dca000 0x3fff7dd2000 0x8000 0x0 rw-p
0x3fff7f80000 0x3fff7f82000 0x2000 0x0 r--p /usr/lib/ld64.so.1
0x3fff7f82000 0x3fff7fa3000 0x21000 0x2000 r-xp /usr/lib/ld64.so.1
0x3fff7fa3000 0x3fff7faf000 0xc000 0x23000 r--p /usr/lib/ld64.so.1
0x3fff7faf000 0x3fff7fb1000 0x2000 0x2e000 r--p /usr/lib/ld64.so.1
0x3fff7fb1000 0x3fff7fb3000 0x2000 0x30000 rw-p /usr/lib/ld64.so.1
0x3fff7ff3000 0x3fff7ffb000 0x8000 0x0 rw-p
0x3fffffda000 0x3ffffffb000 0x21000 0x0 rw-p [stack]
0x3ffffffc000 0x3ffffffe000 0x2000 0x0 r--p [vvar]
0x3ffffffe000 0x40000000000 0x2000 0x0 r-xp [vdso]

Then after mmap() returns it turns into:

Start Addr End Addr Size Offset Perms objfile
0x1000000 0x1001000 0x1000 0x0 r--p /root/main/mmap
0x1001000 0x1002000 0x1000 0x1000 r-xp /root/main/mmap
0x1002000 0x1003000 0x1000 0x2000 r--p /root/main/mmap
0x1003000 0x1004000 0x1000 0x2000 r--p /root/main/mmap
0x1004000 0x1005000 0x1000 0x3000 rw-p /root/main/mmap
0x37c00000 0x3fff7ffb000 0x3ffc03fb000 0x0 rw-p
0x3fffffda000 0x3ffffffb000 0x21000 0x0 rw-p [stack]
0x3ffffffc000 0x3ffffffe000 0x2000 0x0 r--p [vvar]
0x3ffffffe000 0x40000000000 0x2000 0x0 r-xp [vdso]

Interestingly, all addresses mmap() returns before the problem hits are
1MB-aligned, while the last one, which breaks the mappings, is always only
page-aligned. Also, the iteration number 4095 suggests some arithmetic that
leads to an integer overflow.

I did not experiment much with x86, but the problem does not occur there.
The config has CONFIG_PGTABLE_LEVELS=5, but I am not sure about other
options that may be involved.

The tree I used to isolate the issue:

git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm mm-everything

It (looks like it) gets pulled into every linux-next, so the problem
is reproducible there as well.

As we are approaching the merge window, that looks pretty worrisome. I will
try to get more details on what is going on, but maybe you have an immediate
idea?

Thanks!