Re: [PATCH] mm: fix trying to reclaim unevicable LRU page

From: Michal Hocko
Date: Mon Jun 10 2019 - 06:44:11 EST


On Mon 10-06-19 18:42:22, Minchan Kim wrote:
> On Tue, Jun 04, 2019 at 02:28:06PM +0200, Michal Hocko wrote:
> > On Thu 30-05-19 11:42:29, Minchan Kim wrote:
> > > On Tue, May 28, 2019 at 05:14:07PM +0200, Michal Hocko wrote:
> > > > [Cc Pankaj Suryawanshi who has reported a similar problem
> > > > http://lkml.kernel.org/r/SG2PR02MB309806967AE91179CAFEC34BE84B0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx]
> > > >
> > > > On Fri 24-05-19 16:11:14, Minchan Kim wrote:
> > > > > There was below bugreport from Wu Fangsuo.
> > > > >
> > > > > 7200 [ 680.491097] c4 7125 (syz-executor) page:ffffffbf02f33b40 count:86 mapcount:84 mapping:ffffffc08fa7a810 index:0x24
> > > > > 7201 [ 680.531186] c4 7125 (syz-executor) flags: 0x19040c(referenced|uptodate|arch_1|mappedtodisk|unevictable|mlocked)
> > > > > 7202 [ 680.544987] c0 7125 (syz-executor) raw: 000000000019040c ffffffc08fa7a810 0000000000000024 0000005600000053
> > > > > 7203 [ 680.556162] c0 7125 (syz-executor) raw: ffffffc009b05b20 ffffffc009b05b20 0000000000000000 ffffffc09bf3ee80
> > > > > 7204 [ 680.566860] c0 7125 (syz-executor) page dumped because: VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page))
> > > > > 7205 [ 680.578038] c0 7125 (syz-executor) page->mem_cgroup:ffffffc09bf3ee80
> > > > > 7206 [ 680.585467] c0 7125 (syz-executor) ------------[ cut here ]------------
> > > > > 7207 [ 680.592466] c0 7125 (syz-executor) kernel BUG at /home/build/farmland/adroid9.0/kernel/linux/mm/vmscan.c:1350!
> > > > > 7223 [ 680.603663] c0 7125 (syz-executor) Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> > > > > 7224 [ 680.611436] c0 7125 (syz-executor) Modules linked in:
> > > > > 7225 [ 680.616769] c0 7125 (syz-executor) CPU: 0 PID: 7125 Comm: syz-executor Tainted: G S 4.14.81 #3
> > > > > 7226 [ 680.626826] c0 7125 (syz-executor) Hardware name: ASR AQUILAC EVB (DT)
> > > > > 7227 [ 680.633623] c0 7125 (syz-executor) task: ffffffc00a54cd00 task.stack: ffffffc009b00000
> > > > > 7228 [ 680.641917] c0 7125 (syz-executor) PC is at shrink_page_list+0x1998/0x3240
> > > > > 7229 [ 680.649144] c0 7125 (syz-executor) LR is at shrink_page_list+0x1998/0x3240
> > > > > 7230 [ 680.656303] c0 7125 (syz-executor) pc : [<ffffff90083a2158>] lr : [<ffffff90083a2158>] pstate: 60400045
> > > > > 7231 [ 680.666086] c0 7125 (syz-executor) sp : ffffffc009b05940
> > > > > ..
> > > > > 7342 [ 681.671308] c0 7125 (syz-executor) [<ffffff90083a2158>] shrink_page_list+0x1998/0x3240
> > > > > 7343 [ 681.679567] c0 7125 (syz-executor) [<ffffff90083a3dc0>] reclaim_clean_pages_from_list+0x3c0/0x4f0
> > > > > 7344 [ 681.688793] c0 7125 (syz-executor) [<ffffff900837ed64>] alloc_contig_range+0x3bc/0x650
> > > > > 7347 [ 681.717421] c0 7125 (syz-executor) [<ffffff90084925cc>] cma_alloc+0x214/0x668
> > > > > 7348 [ 681.724892] c0 7125 (syz-executor) [<ffffff90091e4d78>] ion_cma_allocate+0x98/0x1d8
> > > > > 7349 [ 681.732872] c0 7125 (syz-executor) [<ffffff90091e0b20>] ion_alloc+0x200/0x7e0
> > > > > 7350 [ 681.740302] c0 7125 (syz-executor) [<ffffff90091e154c>] ion_ioctl+0x18c/0x378
> > > > > 7351 [ 681.747738] c0 7125 (syz-executor) [<ffffff90084c6824>] do_vfs_ioctl+0x17c/0x1780
> > > > > 7352 [ 681.755514] c0 7125 (syz-executor) [<ffffff90084c7ed4>] SyS_ioctl+0xac/0xc0
> > > > >
> > > > > Wu found it's due to [1]. Before that, unevictable page goes to cull_mlocked
> > > > > routine so that it couldn't reach the VM_BUG_ON_PAGE line.
> > > > >
> > > > > To fix the issue, this patch filter out unevictable LRU pages
> > > > > from the reclaim_clean_pages_from_list in CMA.
> > > >
> > > > The changelog is rather modest on details and I have to confess I have
> > > > little bit hard time to understand it. E.g. why do not we need to handle
> > > > the regular reclaim path?
> > >
> > > No need to pass unevictable pages into regular reclaim patch if we are
> > > able to know in advance.
> >
> > I am sorry to be dense here. So what is the difference in the CMA path?
> > Am I right that the pfn walk (CMA) rather than LRU isolation (reclaim)
> > is the key differentiator?
>
> Yes.
> We could isolate unevictable LRU pages from the pfn waker to migrate and
> could discard clean file-backed pages to reduce migration latency in CMA
> path.

Please be explicit about that in the changelog. The fact that this is
not possible from the regular reclaim path is really important and not
obvious from the first glance.

Thanks!

--
Michal Hocko
SUSE Labs