This race was triggered by a 'conntrack -F' command running in parallel
to the insertion of a hash for a new connection. Losing this race led to
a dead conntrack entry effectively blocking traffic for a particular
connection until timeout or flushing the conntrack hashes again.
Now the check for an already dying connection is done inside the lock.
Signed-off-by: Joerg Marx <joerg.marx@secunet.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
int ret = NF_ACCEPT;
if (ct && ct != &nf_conntrack_untracked) {
int ret = NF_ACCEPT;
if (ct && ct != &nf_conntrack_untracked) {
- if (!nf_ct_is_confirmed(ct) && !nf_ct_is_dying(ct))
+ if (!nf_ct_is_confirmed(ct))
ret = __nf_conntrack_confirm(skb);
if (likely(ret == NF_ACCEPT))
nf_ct_deliver_cached_events(ct);
ret = __nf_conntrack_confirm(skb);
if (likely(ret == NF_ACCEPT))
nf_ct_deliver_cached_events(ct);
spin_lock_bh(&nf_conntrack_lock);
spin_lock_bh(&nf_conntrack_lock);
+ /* We have to check the DYING flag inside the lock to prevent
+ a race against nf_ct_get_next_corpse() possibly called from
+ user context, else we insert an already 'dead' hash, blocking
+ further use of that particular connection -JM */
+
+ if (unlikely(nf_ct_is_dying(ct))) {
+ spin_unlock_bh(&nf_conntrack_lock);
+ return NF_ACCEPT;
+ }
+
/* See if there's one in the list already, including reverse:
NAT could have grabbed it without realizing, since we're
not in the hash. If there is, we lost race. */
/* See if there's one in the list already, including reverse:
NAT could have grabbed it without realizing, since we're
not in the hash. If there is, we lost race. */