Re: netpoll trapped question

From: Jeff Moyer
Date: Tue Sep 07 2004 - 12:34:57 EST


==> Regarding Re: netpoll trapped question; Matt Mackall <mpm@xxxxxxxxxxx> adds:

mpm> On Tue, Aug 31, 2004 at 01:10:43PM -0400, Jeff Moyer wrote:
>> Hi, Matt,
>>
>> This part of the netpoll trapped logic seems suspect to me, from
>> include/linux/netdevice.h:
>>
>> static inline void netif_wake_queue(struct net_device *dev) { #ifdef
>> CONFIG_NETPOLL_TRAP if (netpoll_trap()) return; #endif if
>> (test_and_clear_bit(__LINK_STATE_XOFF, &dev->state))
>> __netif_schedule(dev); }
>>
>> static inline void netif_stop_queue(struct net_device *dev) { #ifdef
>> CONFIG_NETPOLL_TRAP if (netpoll_trap()) return; #endif
>> set_bit(__LINK_STATE_XOFF, &dev->state); }
>>
>> This looks buggy. Network drivers are now not able to stop the queue
>> when they run out of Tx descriptors. I think the __netif_schedule is
>> okay to do in the context of netpoll, and certainly a set_bit is okay.
>> Why are these hooks in place? I've tested alt-sysrq-t over netconsole
>> and also netdump with these #ifdef's removed, and things work correctly.
>> Compare this with alt-sysrq-t hanging the system with these tests in
>> place. If I run netdump with this logic still in place, I get the
>> following messages from the tg3
driver>
>>
eth0> BUG! Tx Ring full when queue awake!
>> Shall I send a patch, or have I missed something?

mpm> I don't remember the origin or motivation of this bit, so I'm not sure
mpm> at the moment. Shoot me a patch and I'll poke at it.

What tree would you like me to base the patch on? In absence of response,
it will be the latest -mm.

mpm> Btw, did I send you my thoughts on your recursion patch?

Yes, from your mail:

mpm> But I still don't like this. dev->poll() is liable to attempt to
mpm> recursively take its own driver lock again internally and we still
mpm> deadlock. Have we already seen recursion here? If we do, I think we
mpm> need to fix that in drivers. Meanwhile we should just bail here and
mpm> maybe set a "something bad happened" flag.

I'm not sure I follow. Which lock are you concerned about deadlocking on?
>From an earlier message, you say this:

mpm> It can [recurse] if the poll function does a printk or the like and
mpm> wants to recurse via netconsole.

So I think it's worth addressing in netpoll, as opposed to every driver.

-Jeff
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/