Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()

From: Shakeel Butt
Date: Wed Jul 28 2021 - 10:27:32 EST


On Wed, Jul 28, 2021 at 7:21 AM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:
>
>
> On 2021/7/28 21:23, Michal Hocko wrote:
> > On Wed 28-07-21 17:13:48, Wang Hai wrote:
> >> When I use kfree_rcu() to free a large chunk of memory allocated by
> >> kmalloc_node(), the following dump occurs.
> >>
> >> BUG: kernel NULL pointer dereference, address: 0000000000000020
> >> [...]
> >> Oops: 0000 [#1] SMP
> >> [...]
> >> Workqueue: events kfree_rcu_work
> >> RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
> >> RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
> >> RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
> >> [...]
> >> Call Trace:
> >> kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
> >> kfree_bulk include/linux/slab.h:413 [inline]
> >> kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
> >> process_one_work+0x207/0x530 kernel/workqueue.c:2276
> >> worker_thread+0x320/0x610 kernel/workqueue.c:2422
> >> kthread+0x13d/0x160 kernel/kthread.c:313
> >> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
> >>
> >> When kmalloc_node() allocates a large chunk of memory, it is backed by
> >> a page, not a slab. So when this memory is freed via kfree_rcu(), it
> >> should not be processed by memcg_slab_free_hook(), because
> >> memcg_slab_free_hook() is used for slab.
> >>
> >> So in this case, there is no need to do anything with this large
> >> page in memcg_slab_free_hook(); just skip it.
> >>
> >> Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
> > Are you sure that this commit is really breaking the code? Unless I have
> > missed something there shouldn't be any real change wrt. large
> > allocations here. page_has_obj_cgroups is just a different name for
> > what page_objcgs is giving us.
>
> Yes, we confirmed that this commit introduces the bug.
>
> Maybe we could simply use page_objcgs_check to fix the issue? We will
> check it again.

You will see the same crash with page_objcgs_check as well.
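
To make the quoted description concrete, here is a minimal userspace
sketch of the "skip the large page" idea. It is not kernel code and not
the actual patch: struct model_cache, struct model_page, struct model_obj,
model_free_hook() and model_demo() are all hypothetical stand-ins, and the
sketch makes no claim about what page_objcgs() or page_objcgs_check()
return for a large kmalloc page.

```c
#include <stddef.h>

/* Hypothetical stand-in for a slab cache; only the object size matters here. */
struct model_cache {
    unsigned int size;
};

/* Hypothetical stand-in for a page. A large allocation has no cache. */
struct model_page {
    int is_slab;               /* nonzero only if the page backs a slab */
    struct model_cache *cache; /* NULL for a large, page-backed allocation */
    char *base;                /* start address of the page's memory */
};

/* One pointer handed to the free hook, plus the page it lives in. */
struct model_obj {
    struct model_page *page;
    char *addr;
};

/*
 * Walk the objects being freed. For slab objects, compute the object
 * index from the cache; for anything that is not slab-backed, skip it
 * instead of dereferencing a NULL cache pointer.
 * Returns how many objects were treated as slab objects.
 */
static size_t model_free_hook(const struct model_obj *objs, size_t n)
{
    size_t handled = 0;

    for (size_t i = 0; i < n; i++) {
        const struct model_page *page = objs[i].page;

        /* Assumed skip condition: not a slab page, so nothing to do. */
        if (!page || !page->is_slab || !page->cache || !page->cache->size)
            continue;

        size_t index = (size_t)(objs[i].addr - page->base) / page->cache->size;
        (void)index; /* real code would use this to find per-object data */
        handled++;
    }
    return handled;
}

/* Frees one slab object and one large allocation in the same batch. */
static size_t model_demo(void)
{
    static char slab_mem[256];
    static char large_mem[256];
    struct model_cache cache = { .size = 64 };
    struct model_page slab_page = { 1, &cache, slab_mem };
    struct model_page large_page = { 0, NULL, large_mem };
    struct model_obj batch[2] = {
        { &slab_page, slab_mem + 128 },
        { &large_page, large_mem },
    };

    return model_free_hook(batch, 2);
}
```

In this model the batch mixes both kinds of pointer, which is what a bulk
free from kfree_rcu_work can hand to the hook; without the skip, the
second entry would reach the index computation with a NULL cache.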