Re: Soft lockup on shutdown in nf_ct_iterate_cleanup()

From: Patrick McHardy
Date: Wed Feb 28 2007 - 13:09:33 EST


Chuck Ebbert wrote:
> Patrick McHardy wrote:
>
>>This patch is against current stable-2.6.20, it applies
>>cleanly to 2.6.20 as well.
>
>
> Everything works OK, but I get:
>
> BUG: warning: (!list_empty(&unconfirmed)) at
> net/netfilter/nf_conntrack_core.c:1068/nf_ct_cleanup()
>
> <> nf_ct_cleanup+0x66/0x122 [nf_conntrack]
> <> kill_l4proto+0x0 [nf_conntrack]
> <> nf_conntrack_l4proto_unregister+0x7d/0x82 [nf_conntrack]
> <> nf_conntrack_l3proto_ipv4_fini+0x3c/0x46 [nf_conntrack_ipv4]
> <> sys_delete_module+0x18a/0x1b1


Thanks, the previous approach doesn't seem to work properly without
unpleasant event cache hacks. This patch takes a simpler approach
and keeps the unconfirmed list iteration, but makes sure to make
forward progress.

[NETFILTER]: conntrack: fix {nf,ip}_ct_iterate_cleanup endless loops

Fix {nf,ip}_ct_iterate_cleanup unconfirmed list handling:

- unconfirmed entries can not be killed manually, they are removed on
confirmation or final destruction of the conntrack entry, which means
we might iterate forever without making forward progress.

This can happen in combination with the conntrack event cache, which
holds a reference to the conntrack entry, which is only released when
the packet makes it all the way through the stack or a different
packet is handled.

- taking references to an unconfirmed entry and using it outside the
locked section doesn't work, the list entries are not refcounted and
another CPU might already be waiting to destroy the entry

What the code really wants to do is make sure the references of the hash
table to the selected conntrack entries are released, so they will be
destroyed once all references from skbs and the event cache are dropped.

Since unconfirmed entries haven't even entered the hash yet, simply mark
them as dying and skip confirmation based on that.

Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>

---
commit 841de8621862a5406d2a236dd142a5e5db167d25
tree 347d137f0d6466f0b4da83256bfbeaa4150f105a
parent 8d1117a9f5d302d8d460fbe7ef322b382e45c9ce
author Patrick McHardy <kaber@xxxxxxxxx> Mon, 26 Feb 2007 18:48:05 +0100
committer Patrick McHardy <kaber@xxxxxxxxx> Wed, 28 Feb 2007 19:02:00 +0100

include/linux/netfilter_ipv4/ip_conntrack_core.h | 2 +-
include/net/netfilter/nf_conntrack_core.h | 2 +-
net/ipv4/netfilter/ip_conntrack_core.c | 2 +-
net/netfilter/nf_conntrack_core.c | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/netfilter_ipv4/ip_conntrack_core.h b/include/linux/netfilter_ipv4/ip_conntrack_core.h
index 907d4f5..e3a6df0 100644
--- a/include/linux/netfilter_ipv4/ip_conntrack_core.h
+++ b/include/linux/netfilter_ipv4/ip_conntrack_core.h
@@ -45,7 +45,7 @@ static inline int ip_conntrack_confirm(s
int ret = NF_ACCEPT;

if (ct) {
- if (!is_confirmed(ct))
+ if (!is_confirmed(ct) && !is_dying(ct))
ret = __ip_conntrack_confirm(pskb);
ip_ct_deliver_cached_events(ct);
}
diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h
index 7fdc72c..85634e1 100644
--- a/include/net/netfilter/nf_conntrack_core.h
+++ b/include/net/netfilter/nf_conntrack_core.h
@@ -64,7 +64,7 @@ static inline int nf_conntrack_confirm(s
int ret = NF_ACCEPT;

if (ct) {
- if (!nf_ct_is_confirmed(ct))
+ if (!nf_ct_is_confirmed(ct) && !nf_ct_is_dying(ct))
ret = __nf_conntrack_confirm(pskb);
nf_ct_deliver_cached_events(ct);
}
diff --git a/net/ipv4/netfilter/ip_conntrack_core.c b/net/ipv4/netfilter/ip_conntrack_core.c
index 8556a4f..f8b3009 100644
--- a/net/ipv4/netfilter/ip_conntrack_core.c
+++ b/net/ipv4/netfilter/ip_conntrack_core.c
@@ -1242,7 +1242,7 @@ get_next_corpse(int (*iter)(struct ip_co
list_for_each_entry(h, &unconfirmed, list) {
ct = tuplehash_to_ctrack(h);
if (iter(ct, data))
- goto found;
+ set_bit(IPS_DYING_BIT, &ct->status);
}
write_unlock_bh(&ip_conntrack_lock);
return NULL;
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 9b02ec4..cb29ba7 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1052,7 +1052,7 @@ get_next_corpse(int (*iter)(struct nf_co
list_for_each_entry(h, &unconfirmed, list) {
ct = nf_ct_tuplehash_to_ctrack(h);
if (iter(ct, data))
- goto found;
+ set_bit(IPS_DYING_BIT, &ct->status);
}
write_unlock_bh(&nf_conntrack_lock);
return NULL;