Re: [REGRESSION v5.13-rc1] NULL dereference in do_shrink_slab()

From: Yang Shi
Date: Wed May 12 2021 - 20:58:42 EST


On Wed, May 12, 2021 at 5:10 PM NOMURA JUNICHI(野村 淳一)
<junichi.nomura@xxxxxxx> wrote:
>
> On 2021/05/13 1:31, Yang Shi wrote:
> > On Wed, May 12, 2021 at 5:36 AM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
> >>
> >> +Tejun Heo
> >>
> >> On Wed, May 12, 2021 at 3:48 AM NOMURA JUNICHI(野村 淳一)
> >> <junichi.nomura@xxxxxxx> wrote:
> >>> With commit 476b30a0949a, if a memcg-aware shrinker is registered before
> >>> cgroup_init(), shrinker->nr_deferred is NULL. However, xchg_nr_deferred()
> >>> tries to use it because memcg is turned off via "cgroup_disable=memory".
> >>>
> >>> Any thoughts?
> >
> > Thanks for the report.
> >
> >>
> >> Is there a way to find the call chain of "memcg-aware shrinker is
> >> registered before cgroup_init()"?
> >
> > Other than adding a printk in prealloc_memcg_shrinker() and then
> > checking the dmesg output, I can't think of a better way. I'm not
> > sure whether we have something like early tracing.
>
> This is the first registration of memcg-aware shrinker:
>
> [ 1.933693] Call Trace:
> [ 1.934694] sget_fc+0x20d/0x2f0
> [ 1.935693] ? compare_single+0x10/0x10
> [ 1.936693] ? shmem_create+0x30/0x30
> [ 1.937693] vfs_get_super+0x3e/0x100
> [ 1.938693] get_tree_nodev+0x16/0x20
> [ 1.939693] shmem_get_tree+0x15/0x20
> [ 1.940693] vfs_get_tree+0x2a/0xc0
> [ 1.941693] fc_mount+0x12/0x40
> [ 1.942693] vfs_kern_mount.part.43+0x61/0xa0
> [ 1.943693] kern_mount+0x24/0x40
> [ 1.944693] shmem_init+0x5c/0xc8
> [ 1.945693] mnt_init+0x12f/0x24a
> [ 1.946693] ? __percpu_counter_init+0x8f/0xb0
> [ 1.947693] vfs_caches_init+0xce/0xda
> [ 1.948693] start_kernel+0x479/0x4e3
> [ 1.949693] x86_64_start_reservations+0x24/0x26
> [ 1.950693] x86_64_start_kernel+0x8a/0x8d
> [ 1.951693] secondary_startup_64_no_verify+0xc2/0xcb
>
> That is done after command line parsing but before cgroup_init.

Thanks for sharing the log. I was not aware that shmem is initialized
and mounted so early.
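
To make the ordering concrete, here is a minimal userspace sketch of
how I read the problem. It is only a model, not kernel code: the
function names mirror the ones in this thread, but the bodies, the
single-node nr_deferred, and the helpers parse_cmdline(),
shrink_would_oops() and boot_would_oops() are assumptions made up for
illustration. The disable_at_parse_time flag models what the title
"disable controllers at parse time" suggests, not the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; simplified and partly hypothetical. */

static bool cmdline_disable_memory; /* "cgroup_disable=memory" was parsed */
static bool memcg_disabled;         /* what mem_cgroup_disabled() reports */

struct shrinker {
	bool memcg_aware;
	long *nr_deferred;          /* global counters; one node in this model */
};

static bool mem_cgroup_disabled(void)
{
	return memcg_disabled;
}

/* Hypothetical helper: record the option, optionally apply it right away. */
static void parse_cmdline(bool disable_memory, bool disable_at_parse_time)
{
	cmdline_disable_memory = disable_memory;
	if (disable_at_parse_time)
		memcg_disabled = disable_memory;
}

/* In the model, the recorded option always takes effect here at the latest. */
static void cgroup_init(void)
{
	memcg_disabled = cmdline_disable_memory;
}

static void prealloc_shrinker(struct shrinker *s)
{
	s->nr_deferred = NULL;
	if (s->memcg_aware && !mem_cgroup_disabled())
		return;             /* per-memcg path: no global nr_deferred */
	s->nr_deferred = calloc(1, sizeof(*s->nr_deferred));
	assert(s->nr_deferred != NULL);
}

/* Hypothetical helper: would reclaim touch a NULL global nr_deferred? */
static bool shrink_would_oops(const struct shrinker *s)
{
	bool use_global = !s->memcg_aware || mem_cgroup_disabled();

	return use_global && s->nr_deferred == NULL;
}

/* Replays the boot order from the trace: parse, shmem mount, cgroup_init(). */
static bool boot_would_oops(bool disable_at_parse_time)
{
	struct shrinker shmem = { .memcg_aware = true };
	bool oops;

	cmdline_disable_memory = false;
	memcg_disabled = false;

	parse_cmdline(true, disable_at_parse_time);
	prealloc_shrinker(&shmem);  /* shmem_init() via mnt_init() */
	cgroup_init();

	oops = shrink_would_oops(&shmem);
	free(shmem.nr_deferred);
	return oops;
}
```

In this model boot_would_oops(false) reports the NULL dereference,
because the shrinker is set up while mem_cgroup_disabled() still says
"enabled", and boot_would_oops(true) does not, because the check
already sees the command line setting when shmem registers its
shrinker.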

>
> >> Irrespective, I think we can revert a3e72739b7a7e ("cgroup: fix too
> >> early usage of static_branch_disable()") as 6041186a3258 ("init:
> >> initialize jump labels before command line option parsing") has moved
> >> the initialization of jump labels before command line parsing.
> >
> > That seems to make sense to me. If a memcg-aware shrinker is
> > registered before cgroup_init(), the mem_cgroup_disabled() check in
> > prealloc_memcg_shrinker() would return a false negative. And I don't
> > think any shrinker could be registered before the boot command line
> > is parsed.
>
> Thank you. Shakeel's patch works for me:
>
> [PATCH] cgroup: disable controllers at parse time
> https://lore.kernel.org/linux-mm/20210512201946.2949351-1-shakeelb@xxxxxxxxxx/

Thanks for running the test.

>
> --
> Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.