Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

From: Ingo Molnar
Date: Thu Jul 19 2007 - 06:48:20 EST



* Olaf Kirch <olaf.kirch@xxxxxxxxxx> wrote:

> On Thursday 19 July 2007 12:01, Ingo Molnar wrote:
> > Calling initcall 0xc0603f55: netpoll_init+0x0/0x39()
> > initcall 0xc0603f55: netpoll_init+0x0/0x39() returned 0.
> > initcall 0xc0603f55 ran for 0 msecs: netpoll_init+0x0/0x39()
> > Calling initcall 0xc0604257: netlink_proto_init+0x0/0x12a()
> > NET: Registered protocol family 16
> >
> > and no output ever since - and the box has been up for a few minutes.
>
> Okay, I need to ask a stupid question - did you verify that it's not
> spinning on a spinlock?

the box is fully usable after it has booted up and (as you can see it in
the config i've uploaded) i've got every kernel debug option enabled,
including lockdep.

> Specifically, I'm wondering whether the net_rx_action softirq may be
> scheduled while we're in poll_napi holding the poll_lock.
> net_rx_action would try to take the poll_lock as well, and we'd be
> hung for good. The patch with local_bh_disable/enable was supposed to
> test that idea (this is the "trickle" patch)

the box isnt hung - it just has no networking. (i couldnt get the logs
out if it were hung - netconsole is the only remote debugging option and
with that broken i can only get info out if it boots up fine)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/