Re: v5.18-rc1: migratepages triggers VM_BUG_ON_FOLIO(folio_nr_pages(old) != nr_pages)

From: Matthew Wilcox
Date: Mon Apr 04 2022 - 10:29:26 EST


On Mon, Apr 04, 2022 at 10:05:00AM -0400, Zi Yan wrote:
> On 4 Apr 2022, at 9:29, Naoya Horiguchi wrote:
> > I found that the below VM_BUG_ON_FOLIO is triggered on v5.18-rc1
> > (and also reproducible with mmotm on 3/31).
> > I don't know the mechanism behind the bug, but it doesn't seem to
> > have been reported on LKML yet, so I'm sharing it here. config.gz is attached.
> >
> > [ 48.206424] page:0000000021452e3a refcount:6 mapcount:0 mapping:000000003aaf5253 index:0x0 pfn:0x14e600
> > [ 48.213316] head:0000000021452e3a order:9 compound_mapcount:0 compound_pincount:0
> > [ 48.218830] aops:xfs_address_space_operations [xfs] ino:dee dentry name:"libc.so.6"
> > [ 48.225098] flags: 0x57ffffc0012027(locked|referenced|uptodate|active|private|head|node=1|zone=2|lastcpupid=0x1fffff)
> > [ 48.232792] raw: 0057ffffc0012027 0000000000000000 dead000000000122 ffff8a0dc9a376b8
> > [ 48.238464] raw: 0000000000000000 ffff8a0dc6b23d20 00000006ffffffff 0000000000000000
> > [ 48.244109] page dumped because: VM_BUG_ON_FOLIO(folio_nr_pages(old) != nr_pages)
> > [ 48.249196] ------------[ cut here ]------------
> > [ 48.251240] kernel BUG at mm/memcontrol.c:6857!
> > [ 48.260535] RIP: 0010:mem_cgroup_migrate+0x217/0x320
> > [ 48.286942] Call Trace:
> > [ 48.287665] <TASK>
> > [ 48.288255] iomap_migrate_page+0x64/0x190
> > [ 48.289366] move_to_new_page+0xa3/0x470
>
> Is it because migration code assumes all THPs have order=HPAGE_PMD_ORDER?
> Would the patch below fix the issue?

This looks entirely plausible to me! I do have changes in this area,
but clearly I should have submitted them earlier. Let's get these fixes
in as they are.

Is there a test suite that tests page migration? I usually use xfstests
and it does no page migration at all (at least 'git grep migrate'
finds nothing useful).