Re: [PATCHv2 0/4] zsmalloc: make zspage chain size configurable

From: Mike Kravetz
Date: Fri Jan 13 2023 - 14:57:53 EST


On 01/09/23 12:38, Sergey Senozhatsky wrote:
> Hi,
>
> This turns hard coded limit on maximum number of physical
> pages per-zspage into a config option. It also increases the default
> limit from 4 to 8.
>
> Sergey Senozhatsky (4):
> zsmalloc: rework zspage chain size selection
> zsmalloc: skip chain size calculation for pow_of_2 classes
> zsmalloc: make zspage chain size configurable
> zsmalloc: set default zspage chain size to 8
>
> Documentation/mm/zsmalloc.rst | 168 ++++++++++++++++++++++++++++++++++
> mm/Kconfig | 19 ++++
> mm/zsmalloc.c | 72 +++++----------
> 3 files changed, 212 insertions(+), 47 deletions(-)

Hi Sergey,

The following BUG shows up in linux-next after this series. I can easily
reproduce it by doing the following:

# echo large_value > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
where 'large_value' is so big that there could never possibly be that
many 2MB huge pages in the system.
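
One way to pick such a value is sketched below. This is only a minimal
sketch, not the exact command used for this report: the helper name
impossible_hugepage_count and the --run switch are hypothetical, and the
factor of two is an arbitrary safety margin. The write needs root, and on an
affected kernel it may trigger the BUG, so it is best tried in a disposable VM.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): given total RAM
# in kB, print a number of 2MB huge pages that cannot fit in that memory.
impossible_hugepage_count() {
    mem_kb=$1
    # One huge page is 2048 kB. Take the count that would cover all of RAM,
    # add one, and double it so the request is clearly unsatisfiable.
    echo $(( (mem_kb / 2048 + 1) * 2 ))
}

# Only touch sysfs when explicitly asked to with --run (hypothetical switch).
if [ "${1:-}" = "--run" ]; then
    # MemTotal in /proc/meminfo is reported in kB.
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    impossible_hugepage_count "$mem_kb" \
        > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
fi
```

Without --run the script only defines the helper, so the computed value can
be inspected before anything is written to nr_hugepages.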

--
Mike Kravetz

[ 22.981684] ------------[ cut here ]------------
[ 22.982990] kernel BUG at mm/zsmalloc.c:1982!
[ 22.984204] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[ 22.985561] CPU: 0 PID: 41 Comm: kcompactd0 Not tainted 6.2.0-rc3+ #13
[ 22.987430] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc37 04/01/2014
[ 22.989728] RIP: 0010:zs_page_migrate+0x43c/0x490
[ 22.991070] Code: c7 c6 c8 f6 21 82 e8 b3 73 f6 ff 0f 0b 0f 1f 44 00 00 e9 20 fd ff ff 0f 1f 44 00 00 e9 9e fd ff ff 48 83 ef 01 e9 6b fe ff ff <0f> 0b 48 8b 43 20 49 89 45 20 e9 ff fd ff ff 48 c7 c6 60 d3 1d 82
[ 22.995900] RSP: 0018:ffffc9000121fb20 EFLAGS: 00010246
[ 22.997364] RAX: 0000000000000002 RBX: ffffea0005b8b380 RCX: 0000000000000000
[ 22.999299] RDX: 0000000000000002 RSI: ffffffff81e28a62 RDI: 00000000ffffffff
[ 23.001236] RBP: ffff88816e2cf000 R08: ffffea0005b8b340 R09: 0000000000000008
[ 23.003181] R10: ffff88827fffafe0 R11: 0000000000280000 R12: ffff88816e2cf400
[ 23.005038] R13: ffffea0009e7f800 R14: ffff88817d783880 R15: ffff8881036a44d8
[ 23.006921] FS: 0000000000000000(0000) GS:ffff888277c00000(0000) knlGS:0000000000000000
[ 23.009116] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 23.010732] CR2: 00007f8b14e20550 CR3: 0000000103026004 CR4: 0000000000370ef0
[ 23.013978] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 23.015931] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 23.017892] Call Trace:
[ 23.018664] <TASK>
[ 23.019345] move_to_new_folio+0x14d/0x1f0
[ 23.020710] migrate_pages+0xe36/0x1240
[ 23.021895] ? __pfx_compaction_alloc+0x10/0x10
[ 23.023202] ? _raw_write_lock+0x13/0x30
[ 23.024335] ? __pfx_compaction_free+0x10/0x10
[ 23.025608] ? isolate_movable_page+0xff/0x250
[ 23.026880] compact_zone+0x9da/0xdf0
[ 23.027990] kcompactd_do_work+0x1d2/0x2c0
[ 23.029180] kcompactd+0x220/0x3e0
[ 23.030166] ? __pfx_autoremove_wake_function+0x10/0x10
[ 23.031612] ? __pfx_kcompactd+0x10/0x10
[ 23.032706] kthread+0xe6/0x110
[ 23.033648] ? __pfx_kthread+0x10/0x10
[ 23.034704] ret_from_fork+0x29/0x50
[ 23.035734] </TASK>
[ 23.036443] Modules linked in: rfkill ip6table_filter ip6_tables sunrpc snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device 9p netfs snd_pcm joydev 9pnet_virtio virtio_balloon snd_timer snd soundcore 9pnet virtio_blk virtio_net net_failover failover virtio_console crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw virtio_pci virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring fuse
[ 23.049869] ---[ end trace 0000000000000000 ]---
[ 23.051154] RIP: 0010:zs_page_migrate+0x43c/0x490
[ 23.052466] Code: c7 c6 c8 f6 21 82 e8 b3 73 f6 ff 0f 0b 0f 1f 44 00 00 e9 20 fd ff ff 0f 1f 44 00 00 e9 9e fd ff ff 48 83 ef 01 e9 6b fe ff ff <0f> 0b 48 8b 43 20 49 89 45 20 e9 ff fd ff ff 48 c7 c6 60 d3 1d 82
[ 23.057413] RSP: 0018:ffffc9000121fb20 EFLAGS: 00010246
[ 23.058892] RAX: 0000000000000002 RBX: ffffea0005b8b380 RCX: 0000000000000000
[ 23.060867] RDX: 0000000000000002 RSI: ffffffff81e28a62 RDI: 00000000ffffffff
[ 23.062835] RBP: ffff88816e2cf000 R08: ffffea0005b8b340 R09: 0000000000000008
[ 23.064825] R10: ffff88827fffafe0 R11: 0000000000280000 R12: ffff88816e2cf400
[ 23.066806] R13: ffffea0009e7f800 R14: ffff88817d783880 R15: ffff8881036a44d8
[ 23.068738] FS: 0000000000000000(0000) GS:ffff888277c00000(0000) knlGS:0000000000000000
[ 23.071022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 23.072579] CR2: 00007f8b14e20550 CR3: 0000000103026004 CR4: 0000000000370ef0
[ 23.076152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 23.078172] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 23.080134] note: kcompactd0[41] exited with preempt_count 1