Re: [RFC][PATCH] netconsole: avoid deadlock on printk from driver code

From: David Miller
Date: Wed Aug 13 2008 - 06:29:58 EST


From: Alexey Dobriyan <adobriyan@xxxxxxxxx>
Date: Wed, 13 Aug 2008 13:59:43 +0400

> On Wed, Aug 13, 2008 at 11:53:24AM +0200, Vegard Nossum wrote:
> > I encountered a hard-to-debug deadlock when I pulled out the plug of my
> > RealTek 8139 which was also running netconsole: The driver wants to print
> > a "link down" message. However, this triggers netconsole, which wants to
> > print the message using the same device. Here is a backtrace:
> >
> > [<c05916b6>] _spin_lock_irqsave+0x76/0x90
> > [<c035b255>] rtl8139_start_xmit+0x65/0x130 <-- spin_lock(&tp->lock)
> > [<c04c5e28>] netpoll_send_skb+0x158/0x1a0
> > [<c04c62fb>] netpoll_send_udp+0x1db/0x1f0
> > [<c037c70c>] write_msg+0x8c/0xc0
> > [<c0135883>] __call_console_drivers+0x53/0x60
> > [<c01358db>] _call_console_drivers+0x4b/0x90
> > [<c0135a25>] release_console_sem+0xc5/0x1f0
> > [<c0135f0b>] vprintk+0x1ab/0x3e0
> > [<c013615b>] printk+0x1b/0x20
> > [<c0349736>] mii_check_media+0x196/0x1e0
> > [<c03597f4>] rtl_check_media+0x24/0x30
> > [<c035a0ea>] rtl8139_interrupt+0x42a/0x4a0 <-- spin_lock(&tp->lock)
> > [<c01716d8>] handle_IRQ_event+0x28/0x70
> > [<c0172d9b>] handle_fasteoi_irq+0x6b/0xe0
> > [<c0107128>] do_IRQ+0x48/0xa0
> >
> > The least invasive fix is to detect that we're trying to re-enter the
> > driver code. We provide a netdev_busy() function which can be used to
> > determine whether a deadlock can occur if we try to transmit another
> > packet.
> >
> > Note that this may lead to lost messages if the driver is active on
> > another CPU while we try to use the same device for netconsole.
>
> This sucks.

It's also the wrong fix.

As a quicker and more palatable solution, print your link status
message from some kind of deferred context where the lock is not
held, or similar.
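
To make the deferral idea concrete, here is a rough userspace model in C.
It is only a sketch under stated assumptions: every name in it (fake_nic,
fake_xmit, fake_interrupt, fake_deferred_work) is hypothetical and is not
kernel or 8139too code, and the lock is modelled as a plain flag so the
would-be deadlock can be counted instead of hanging. In a real driver the
deferred step might be something like a work item, but that choice is an
assumption here, not something the thread settles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct fake_nic {
	bool lock_held;        /* stands in for the driver's tp->lock */
	bool link_up;          /* last link state seen by the "interrupt" */
	bool link_msg_pending; /* set under the lock, consumed after it is dropped */
	int  msgs_sent;        /* messages that made it through the transmit path */
	int  would_deadlock;   /* transmit attempts made while the lock was held */
};

/* Models the transmit path, which needs the device lock. */
static void fake_xmit(struct fake_nic *nic, const char *msg)
{
	if (nic->lock_held) {
		/* A real spinlock would spin forever here; we only count it. */
		nic->would_deadlock++;
		return;
	}
	nic->lock_held = true;
	nic->msgs_sent++;
	printf("netconsole: %s\n", msg);
	nic->lock_held = false;
}

/* Models the interrupt handler: record the event, do not print under the lock. */
static void fake_interrupt(struct fake_nic *nic, bool new_link_state)
{
	nic->lock_held = true;
	if (nic->link_up != new_link_state) {
		nic->link_up = new_link_state;
		nic->link_msg_pending = true;
	}
	nic->lock_held = false;
}

/* Models the deferred context: runs later, with the lock not held. */
static void fake_deferred_work(struct fake_nic *nic)
{
	if (!nic->link_msg_pending)
		return;
	nic->link_msg_pending = false;
	fake_xmit(nic, nic->link_up ? "link up" : "link down");
}
```

The point of the split is that the message is sent only after the handler
has released the lock, so the transmit path never re-enters the driver
while the lock is held. The cost is that the message appears slightly
later than the event it reports.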