Re: Lockup with 2.6.9-ac15 related to netconsole

From: Patrick McHardy
Date: Wed Dec 22 2004 - 10:01:20 EST


Francois Romieu wrote:
Patrick McHardy <kaber@xxxxxxxxx> :
[...]

at least the queued messages ordered. But you need to grab
dev->queue_lock, otherwise you risk corrupting qdisc internal data.
You should probably also deal with the noqueue-qdisc, which doesn't
have an enqueue function. So it should look something like this:


If I am not mistaken, a failure on spin_trylock + the test on
xmit_lock_owner imply that it is safe to directly handle the
queue. It means that qdisc_run() has been interrupted on the
current cpu and the other paths seem fine as well. Counter-example
is welcome (no joke).

enqueue is only protected by dev->queue_lock, and dev->queue_lock
is dropped as soon as dev->xmit_lock is grabbed, so any other CPU
might call enqueue at the same time.

Example:

CPU1 CPU2

dev_queue_xmit dev_queue_xmit
lock(dev->queue_lock) lock(dev->queue_lock)
q->enqueue
qdisc_run
qdisc_restart
trylock(dev->xmit_lock), ok
unlock(dev->queue_lock)
...
printk("something")
...
netpoll_send_skb
trylock(dev->xmit_lock), fails
q->enqueue q->enqueue

Of course the patch is completely ugly and violates any layering
principle one could think of. It was not submitted for inclusion :o)

Sure, but I think we should have a short-term workaround until
a better solution has been invented. Maybe dropping the packets
would be best for now, it only affects printks issued in paths
starting at qdisc_restart (-> hard_start_xmit -> ...). Queueing
the packets might also cause reordering since not all packets
are queued.

while (!spin_trylock(&np->dev->xmit_lock)) {
if (np->dev->xmit_lock_owner == smp_processor_id()) {
struct Qdisc *q;

rcu_read_lock();
q = rcu_dereference(dev->qdisc);
if (q->enqueue) {
spin_lock(&dev->queue_lock);


I'd expect it to deadlock if dev_queue_xmit -> qdisc_run is interrupted
on the current cpu and a printk is issued as dev->queue_lock will have
been taken elsewhere.

Hmm this is complicated, let me think some more about it.

Regards
Patrick
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/