Re: mm: mkfs.ext4 invoked oom-killer on i386 - pagecache_get_page

From: Naresh Kamboju
Date: Wed May 20 2020 - 07:56:33 EST


FYI,

This issue is specific to the 32-bit architectures i386 and arm on the linux-next tree.
According to the test results history, this problem first appeared in next-20200430:
Bad : next-20200430
Good : next-20200429

Steps to reproduce:
dd if=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190504A00573 of=/dev/null bs=1M count=2048
or
mkfs -t ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190804A00BE5


Problem:
[ 38.802375] dd invoked oom-killer: gfp_mask=0x100cc0(GFP_USER), order=0, oom_score_adj=0

i386 crash log: https://pastebin.com/Hb8U89vU
arm crash log: https://pastebin.com/BD9t3JTm

On Tue, 19 May 2020 at 14:15, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Tue 19-05-20 10:11:25, Arnd Bergmann wrote:
> > On Tue, May 19, 2020 at 9:52 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > On Mon 18-05-20 19:40:55, Naresh Kamboju wrote:
> > > > Thanks for looking into this problem.
> > > >
> > > > On Sat, 2 May 2020 at 02:28, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Fri, 1 May 2020 18:08:28 +0530 Naresh Kamboju <naresh.kamboju@xxxxxxxxxx> wrote:
> > > > >
> > > > > > mkfs -t ext4 invoked oom-killer on i386 kernel running on x86_64 device
> > > > > > and started happening on linux -next master branch kernel tag next-20200430
> > > > > > and next-20200501. We did not bisect this problem.
> > > [...]
> > > > Creating journal (131072 blocks): [ 31.251333] mkfs.ext4 invoked
> > > > oom-killer: gfp_mask=0x101cc0(GFP_USER|__GFP_WRITE), order=0,
> > > > oom_score_adj=0
> > > [...]
> > > > [ 31.500943] DMA free:187396kB min:22528kB low:28160kB high:33792kB
> > > > reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> > > > active_file:4736kB inactive_file:431688kB unevictable:0kB
> > > > writepending:62020kB present:783360kB managed:668264kB mlocked:0kB
> > > > kernel_stack:888kB pagetables:0kB bounce:0kB free_pcp:880kB
> > > > local_pcp:216kB free_cma:163840kB
> > >
> > > This is really unexpected. You are saying this is a regular i386 and DMA
> > > should be bottom 16MB while yours is 780MB and the rest of the low mem
> > > is in the Normal zone which is completely missing here. How have you got
> > > to that configuration? I have to say I haven't seen anything like that
> > > on i386.
> >
> > I think that line comes from an ARM32 beaglebone-X15 machine showing
> > the same symptom. The i386 line from the log file that Naresh linked to at
> > https://lkft.validation.linaro.org/scheduler/job/1406110#L1223 is less
> > unusual:
>
> OK, that makes more sense! At least for the memory layout.
>
> > [ 34.931663] Node 0 active_anon:21464kB inactive_anon:8688kB
> > active_file:16604kB inactive_file:849976kB unevictable:0kB
> > isolated(anon):0kB isolated(file):0kB mapped:25284kB dirty:58952kB
> > writeback:27772kB shmem:8944kB writeback_tmp:0kB unstable:0kB
> > all_unreclaimable? yes
> > [ 34.955523] DMA free:3356kB min:68kB low:84kB high:100kB
> > reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> > active_file:0kB inactive_file:11964kB unevictable:0kB
> > writepending:11980kB present:15964kB managed:15876kB mlocked:0kB
> > kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
> > free_cma:0kB
> > [ 34.983385] lowmem_reserve[]: 0 825 1947 825
> > [ 34.987678] Normal free:3948kB min:7732kB low:8640kB high:9548kB
> > reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> > active_file:1096kB inactive_file:786400kB unevictable:0kB
> > writepending:65432kB present:884728kB managed:845576kB mlocked:0kB
> > kernel_stack:1112kB pagetables:0kB bounce:0kB free_pcp:2908kB
> > local_pcp:500kB free_cma:0kB
>
> The lowmem is really low (way below the min watermark), so even memory
> reserves for high priority and atomic requests are depleted. There is
> still 786MB of inactive page cache to be reclaimed. It doesn't seem to
> be dirty or under the writeback but it still might be pinned by the
> filesystem. I would suggest watching vmscan reclaim tracepoints and
> check why the reclaim fails to reclaim anything.
> --
> Michal Hocko
> SUSE Labs
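
To make the watermark comparison above easy to repeat on other crash logs, here is a minimal sketch (not part of the original report) that scans an OOM report for per-zone lines and lists the zones whose free memory is below the min watermark. The function name zones_below_min and the regular expression are my own assumptions for illustration; the sample values are copied from the i386 log quoted above.

```python
import re

# Matches the start of a per-zone line from the OOM report, e.g.
# "Normal free:3948kB min:7732kB low:8640kB high:9548kB"
ZONE_RE = re.compile(
    r"(?P<zone>\w+) free:(?P<free>\d+)kB min:(?P<min>\d+)kB "
    r"low:(?P<low>\d+)kB high:(?P<high>\d+)kB"
)


def zones_below_min(log_text):
    """Return {zone: (free_kB, min_kB)} for zones with free memory below min.

    Hypothetical helper for illustration; it only looks at the free/min
    fields and ignores lowmem_reserve and the other counters.
    """
    # Collapse whitespace so zone entries wrapped by the mail client still match.
    flat = " ".join(log_text.split())
    result = {}
    for m in ZONE_RE.finditer(flat):
        free_kb, min_kb = int(m["free"]), int(m["min"])
        if free_kb < min_kb:
            result[m["zone"]] = (free_kb, min_kb)
    return result


# Zone lines taken from the i386 log quoted above (truncated after "high:").
SAMPLE = """
[ 34.955523] DMA free:3356kB min:68kB low:84kB high:100kB
[ 34.987678] Normal free:3948kB min:7732kB low:8640kB high:9548kB
"""

if __name__ == "__main__":
    for zone, (free_kb, min_kb) in zones_below_min(SAMPLE).items():
        print(f"{zone}: free {free_kb}kB is below min {min_kb}kB")
```

On the sample lines only the Normal zone is flagged, which matches the observation that lowmem is below the min watermark while the DMA zone is not.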