[PATCH] rcu: Fix NOCB follower not waking up causing OOM

From: David Chen
Date: Mon Jul 30 2018 - 18:25:00 EST


From: Chunwei Chen <david.chen@xxxxxxxxxxx>

In 4.9 stable branch, we hit an issue where one of the NOCB follower
thread wasn't waking up, even though the follower list is not empty.
The follower list just kept on growing and never got reclaimed, and
finally caused the system to run out of memory.

This issue is similar to the issue fixed by 6b5fc3a13318, both are
caused by lacking proper memory barrier before swake_up, and causing
wake up to be missed. While we do have smp_mb__after_atomic here, but
smp_mb__after_atomic is only a compiler barrier on x86, so it doesn't
prevent this at all. So we fix this issue by changing it to smp_mb.

Note, this patch doesn't apply to master, because in master the follower
list is changed to spinlock protected list. So there's no such issue
there.

Signed-off-by: Chunwei Chen <david.chen@xxxxxxxxxxx>
Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
---
kernel/rcu/tree_plugin.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 554ea54..b2d663d 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2090,7 +2090,7 @@ static void nocb_leader_wait(struct rcu_data *my_rdp)
/* Append callbacks to follower's "done" list. */
tail = xchg(&rdp->nocb_follower_tail, rdp->nocb_gp_tail);
*tail = rdp->nocb_gp_head;
- smp_mb__after_atomic(); /* Store *tail before wakeup. */
+ smp_mb(); /* Store *tail before wakeup. */
if (rdp != my_rdp && tail == &rdp->nocb_follower_head) {
/*
* List was empty, wake up the follower.
--
1.9.4