[PATCH 4.4 28/28] tipc: fix crash during node removal

From: Greg Kroah-Hartman
Date: Tue Apr 25 2017 - 11:31:07 EST


4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jon Paul Maloy <jon.maloy@xxxxxxxxxxxx>

commit d25a01257e422a4bdeb426f69529d57c73b235fe upstream.

When the TIPC module is unloaded, we have identified a race condition
that allows a node reference counter to go to zero and the node instance
being freed before the node timer is finished with accessing it. This
leads to occasional crashes, especially in multi-namespace environments.

The scenario goes as follows:

CPU0:(node_stop) CPU1:(node_timeout) // ref == 2

1: if(!mod_timer())
2: if (del_timer())
3: tipc_node_put() // ref -> 1
4: tipc_node_put() // ref -> 0
5: kfree_rcu(node);
6: tipc_node_get(node)
7: // BOOM!

We now clean up this functionality as follows:

1) We remove the node pointer from the node lookup table before we
attempt deactivating the timer. This way, we reduce the risk that
tipc_node_find() may obtain a valid pointer to an instance marked
for deletion; a harmless but undesirable situation.

2) We use del_timer_sync() instead of del_timer() to safely deactivate
the node timer without any risk that it might be reactivated by the
timeout handler. There is no risk of deadlock here, since the two
functions never touch the same spinlocks.

3: We remove a pointless tipc_node_get() + tipc_node_put() from the
timeout handler.

Reported-by: Zhijiang Hu <huzhijiang@xxxxxxxxx>
Acked-by: Ying Xue <ying.xue@xxxxxxxxxxxxx>
Signed-off-by: Jon Maloy <jon.maloy@xxxxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
net/tipc/node.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)

--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -102,9 +102,10 @@ static unsigned int tipc_hashfn(u32 addr

static void tipc_node_kref_release(struct kref *kref)
{
- struct tipc_node *node = container_of(kref, struct tipc_node, kref);
+ struct tipc_node *n = container_of(kref, struct tipc_node, kref);

- tipc_node_delete(node);
+ kfree(n->bc_entry.link);
+ kfree_rcu(n, rcu);
}

void tipc_node_put(struct tipc_node *node)
@@ -216,21 +217,20 @@ static void tipc_node_delete(struct tipc
{
list_del_rcu(&node->list);
hlist_del_rcu(&node->hash);
- kfree(node->bc_entry.link);
- kfree_rcu(node, rcu);
+ tipc_node_put(node);
+
+ del_timer_sync(&node->timer);
+ tipc_node_put(node);
}

void tipc_node_stop(struct net *net)
{
- struct tipc_net *tn = net_generic(net, tipc_net_id);
+ struct tipc_net *tn = tipc_net(net);
struct tipc_node *node, *t_node;

spin_lock_bh(&tn->node_list_lock);
- list_for_each_entry_safe(node, t_node, &tn->node_list, list) {
- if (del_timer(&node->timer))
- tipc_node_put(node);
- tipc_node_put(node);
- }
+ list_for_each_entry_safe(node, t_node, &tn->node_list, list)
+ tipc_node_delete(node);
spin_unlock_bh(&tn->node_list_lock);
}

@@ -313,9 +313,7 @@ static void tipc_node_timeout(unsigned l
if (rc & TIPC_LINK_DOWN_EVT)
tipc_node_link_down(n, bearer_id, false);
}
- if (!mod_timer(&n->timer, jiffies + n->keepalive_intv))
- tipc_node_get(n);
- tipc_node_put(n);
+ mod_timer(&n->timer, jiffies + n->keepalive_intv);
}

/**