[PATCH] memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event

From: brookxu
Date: Thu Mar 05 2020 - 00:52:10 EST


When one eventfd monitors multiple memory thresholds of a cgroup, closing
it makes the system delete the related events. If another eventfd starts
monitoring a memory threshold of the same cgroup before all of these
events have been deleted, thresholds->primary is not empty but
thresholds->spare is NULL.

__mem_cgroup_usage_unregister_event() then dereferences the NULL pointer
and crashes:

[  138.925809] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[  138.926817] IP: [<ffffffff8116c9b7>] mem_cgroup_usage_unregister_event+0xd7/0x1f0
[  138.927701] PGD 73bce067 PUD 76ff3067 PMD 0
[  138.928384] Oops: 0002 [#1] SMP
[  138.935218] CPU: 1 PID: 14 Comm: kworker/1:0 Not tainted 3.10.107-1-tlinux2-0047 #1
[  138.936076] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  138.936988] Workqueue: events cgroup_event_remove
[  138.937581] task: ffff88007c07e440 ti: ffff88007c090000 task.ti: ffff88007c090000
[  138.938485] RIP: 0010:[<ffffffff8116c9b7>]  [<ffffffff8116c9b7>] mem_cgroup_usage_unregister_event+0xd7/0x1f0
[  138.940116] RSP: 0018:ffff88007c093dc0  EFLAGS: 00010202
[  138.941056] RAX: 0000000000000001 RBX: ffff880073b3e1a8 RCX: 0000000000000001
[  138.942095] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff880074519900
[  138.943129] RBP: ffff88007c093df0 R08: 0000000000000001 R09: 0000000000000000
[  138.946057] R10: 000000000000b95b R11: 0000000000000001 R12: ffff880076cc0480
[  138.947805] R13: ffff880073b3e1d0 R14: 0000000000000000 R15: 0000000000000000
[  138.948903] FS:  0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[  138.952264] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  138.953123] CR2: 0000000000000004 CR3: 00000000753b3000 CR4: 00000000000406e0
[  138.954110] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  138.963245] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  138.964088] Stack:
[  138.964456]  0000000000000246 ffff880076d6df68 ffff8800751b4c00 ffff880076d6df00
[  138.965650]  0000000000000040 ffff880076d6df68 ffff88007c093e18 ffffffff810b17ba
[  138.966803]  ffff88007d04cf80 ffff88007fd115c0 ffff88007fd15600 ffff88007c093e60
[  138.968179] Call Trace:
[  138.968592]  [<ffffffff810b17ba>] cgroup_event_remove+0x3a/0x80
[  138.969321]  [<ffffffff81066387>] process_one_work+0x177/0x450
[  138.970051]  [<ffffffff8106721b>] worker_thread+0x11b/0x390
[  138.970741]  [<ffffffff81067100>] ? manage_workers.isra.26+0x290/0x290
[  138.971612]  [<ffffffff8106dacf>] kthread+0xcf/0xe0
[  138.972340]  [<ffffffff8106da00>] ? insert_kthread_work+0x40/0x40
[  138.973142]  [<ffffffff81aad9f8>] ret_from_fork+0x58/0x90
[  138.973843]  [<ffffffff8106da00>] ? insert_kthread_work+0x40/0x40

The solution is to check, when deleting the event, whether the thresholds
associated with the eventfd have already been cleared. If so, we do nothing.
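
The check can be modeled in user space as follows. This is only a minimal
sketch of the counting logic, not the kernel code: struct threshold_entry,
struct threshold_ary, count_entries() and unregister_needed() are
hypothetical stand-ins, and an integer id replaces the eventfd context
pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, for illustration only. */
struct threshold_entry {
	int eventfd_id;			/* models the eventfd context pointer */
	unsigned long threshold;
};

struct threshold_ary {
	size_t size;
	const struct threshold_entry *entries;
};

/*
 * Walk the primary array once and count the entries that stay (kept)
 * and the entries that belong to the eventfd being removed (removed).
 */
void count_entries(const struct threshold_ary *primary, int eventfd_id,
		   size_t *kept, size_t *removed)
{
	size_t i;

	*kept = *removed = 0;
	for (i = 0; i < primary->size; i++) {
		if (primary->entries[i].eventfd_id != eventfd_id)
			(*kept)++;
		else
			(*removed)++;
	}
}

/*
 * Return 1 if the unregister path still has entries to drop, or 0 if
 * nothing refers to this eventfd any more and the caller should bail
 * out before touching the spare array.
 */
int unregister_needed(const struct threshold_ary *primary, int eventfd_id)
{
	size_t kept, removed;

	count_entries(primary, eventfd_id, &kept, &removed);
	return removed != 0;
}
```

With a primary array holding two entries for id 1 and one entry for id 2,
unregister_needed() returns 1 for id 1 and 0 for an id that is no longer
present, which corresponds to the early "goto unlock" in the patch.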

Signed-off-by: Chunguang Xu <brookxu@xxxxxxxxxxx>
---
 mm/memcontrol.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d09776c..4575a58 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4027,7 +4027,7 @@ static void __mem_cgroup_usage_unregister_event(struct mem_cgroup *memcg,
 	struct mem_cgroup_thresholds *thresholds;
 	struct mem_cgroup_threshold_ary *new;
 	unsigned long usage;
-	int i, j, size;
+	int i, j, size, entries;
 
 	mutex_lock(&memcg->thresholds_lock);
 
@@ -4047,12 +4047,18 @@ static void __mem_cgroup_usage_unregister_event(struct mem_cgroup *memcg,
 	__mem_cgroup_threshold(memcg, type == _MEMSWAP);
 
 	/* Calculate new number of threshold */
-	size = 0;
+	size = entries = 0;
 	for (i = 0; i < thresholds->primary->size; i++) {
 		if (thresholds->primary->entries[i].eventfd != eventfd)
 			size++;
+		else
+			entries++;
 	}
 
+	/* If items related to eventfd have been cleared, nothing to do */
+	if (!entries)
+		goto unlock;
+
 	new = thresholds->spare;
 
 	/* Set thresholds array to NULL if we don't have thresholds */
--
1.8.3.1