Re: IPF Montvale machine panic when running a network-relevant testing

From: Zhang, Yanmin
Date: Wed Jun 18 2008 - 01:12:34 EST



On Tue, 2008-06-17 at 20:37 -0700, David Miller wrote:
> From: "Zhang, Yanmin" <yanmin_zhang@xxxxxxxxxxxxxxx>
> Date: Wed, 18 Jun 2008 11:27:43 +0800
>
> > This issue is caused by TCP defer accept. Normally, process context calls lock_sock
> > to take a sleeping lock. BH (softirq) context calls bh_lock_sock(_nested), which only
> > takes sk->sk_lock.slock without sleeping, and then decides how to proceed based on
> > whether sk->sk_lock.owned == 0. That works well when process context and BH context
> > operate on the same sk at the same time. But with TCP defer accept it doesn't, because
> > process context (for example, in inet_csk_accept) locks the listen sk, while BH
> > context (in tcp_v4_rcv, for example) locks the child sk and calls
> > tcp_defer_accept_check => inet_csk_reqsk_queue_add => reqsk_queue_add, so there is a race
> > on access to the listen sock.
> >
> > The patch below, against 2.6.26-rc6, fixes the issue.
> >
> > Signed-off-by: Zhang Yanmin <yanmin.zhang@xxxxxxxxx>
>
> We reverted the guilty defer accept changes, please test Linus's
> current tree.
I happened to download the git tree on June 16th, and it already includes the revert.

I confirm it fixes the hang issue.
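
For anyone reading the archive later, the locking convention described in the
quoted analysis can be sketched in userspace C. This is only a rough model under
stated assumptions, not kernel code: every model_* name is hypothetical, a C11
atomic_flag stands in for sk->sk_lock.slock, and the sleeping part of lock_sock
is replaced by an assertion that there is a single owner. The last function
shows the problematic pattern: the BH path holds only the child's lock while it
updates the listener's queue, so the listener's owned flag is never consulted.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace model only; all model_* names are hypothetical. */
struct model_sock {
    atomic_flag slock;   /* stands in for sk->sk_lock.slock */
    int owned;           /* stands in for sk->sk_lock.owned */
    int backlog;         /* packets deferred while a process owns the sock */
    int processed;       /* packets handled so far */
    int accept_qlen;     /* stands in for the listener's accept queue */
};

static void model_sock_init(struct model_sock *sk)
{
    atomic_flag_clear(&sk->slock);
    sk->owned = sk->backlog = sk->processed = sk->accept_qlen = 0;
}

static void model_spin_lock(struct model_sock *sk)
{
    while (atomic_flag_test_and_set_explicit(&sk->slock, memory_order_acquire))
        ; /* busy-wait, never sleep */
}

static void model_spin_unlock(struct model_sock *sk)
{
    atomic_flag_clear_explicit(&sk->slock, memory_order_release);
}

/* Process context: take ownership. The real lock_sock() would sleep if the
 * sock were already owned; this model assumes a single owner and asserts. */
static void model_lock_sock(struct model_sock *sk)
{
    model_spin_lock(sk);
    assert(!sk->owned);
    sk->owned = 1;
    model_spin_unlock(sk);
}

/* Process context: drain whatever BH deferred, then drop ownership. */
static void model_release_sock(struct model_sock *sk)
{
    model_spin_lock(sk);
    assert(sk->owned);
    sk->processed += sk->backlog;
    sk->backlog = 0;
    sk->owned = 0;
    model_spin_unlock(sk);
}

/* BH context on the SAME sock: process now if unowned, else defer. */
static void model_bh_rcv(struct model_sock *sk)
{
    model_spin_lock(sk);
    if (!sk->owned)
        sk->processed++;
    else
        sk->backlog++;
    model_spin_unlock(sk);
}

/* The racy pattern: BH holds only the child's slock but updates the
 * listener. Returns 1 if a process owned the listener at that moment. */
static int model_defer_accept_add(struct model_sock *child,
                                  struct model_sock *listener)
{
    int listener_was_owned;

    model_spin_lock(child);
    listener_was_owned = listener->owned; /* nothing stops us either way */
    listener->accept_qlen++;              /* unprotected update */
    model_spin_unlock(child);
    return listener_was_owned;
}
```

In this model, model_bh_rcv on a sock that a process owns only grows the
backlog, which is the case the convention handles. model_defer_accept_add
changes the listener regardless of who owns it, which is the case it does not.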

-yanmin

