Re: [REGRESSION] 2.6.24/25: random lockups when accessing external USB harddrive

From: David Brownell
Date: Fri Jun 27 2008 - 12:08:24 EST


On Thursday 26 June 2008, Stefan Becker wrote:
> it seems that the following code paths have the interrupts
> enabled when calling usb_hcd_unlink_urb_from_ep():
>
>    [<c0574d9d>] usb_hcd_unlink_urb_from_ep+0x25/0x6b
>    [<de850559>] uhci_giveback_urb+0xcd/0x1e3 [uhci_hcd]
>    [<de850e02>] uhci_scan_schedule+0x511/0x720 [uhci_hcd]
> ...
>    [<de8529c3>] uhci_irq+0x131/0x142 [uhci_hcd]
>    [<c05750cb>] usb_hcd_irq+0x23/0x51

I'll let Alan look at that one, but:


> and
>
>    [<c0574d9d>] usb_hcd_unlink_urb_from_ep+0x25/0x6b
>    [<de839d55>] ehci_urb_done+0x73/0x92 [ehci_hcd]
>    [<de83a92f>] qh_completions+0x373/0x3eb [ehci_hcd]
>    [<de83aa43>] ehci_work+0x9c/0x6a9 [ehci_hcd]
> ...
>    [<de83ec3c>] ehci_irq+0x241/0x265 [ehci_hcd]
> ...
>    [<c05750cb>] usb_hcd_irq+0x23/0x51
>
>
> Is that enough information to fix the problem?

No, but it suggests a few ways to get closer to the root cause.

That looks fishy. The IRQ handler does not re-enable IRQs,
so it looks like something re-enabled IRQs in the middle of
an IRQ handler! Which will obviously cause trouble.

(I'll assume that this isn't a case of a misleading stack dump,
where the IRQ frames are dead ones that were wrongly dumped...)

If commit 442258e2ff69276ff767f3703b30ce6a31fdd181 isn't in the
kernel showing those bugs, try applying it. Otherwise, I suggest
you try putting something like your

	if (!raw_irqs_disabled()) {
		printk(KERN_CRIT "interrupts enabled!\n");
		dump_stack();
	}

logic at the beginning and end of usb_hcd_giveback_urb ... and
maybe have it dump the address of the completion handler, when
you can finger such a handler as re-enabling IRQs.
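
To make the suggestion above concrete, here is a rough userspace sketch
of the idea, not kernel code. It assumes a plain flag can stand in for
the CPU's IRQ state, and every mock_* name is made up for illustration;
only urb->complete and the "interrupts enabled!" message come from the
real thing. In the kernel the check would be the raw_irqs_disabled()
test with printk()/dump_stack(), placed around the urb->complete(urb)
call in usb_hcd_giveback_urb().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct mock_urb;
typedef void (*mock_complete_t)(struct mock_urb *urb);

/* Hypothetical stand-in for struct urb: only the completion handler. */
struct mock_urb {
	mock_complete_t complete;
};

/* Hypothetical stand-in for the CPU interrupt-enable state. */
static bool mock_irqs_enabled;

/*
 * Mock of the suggested check.  Returns true when IRQs are (wrongly)
 * enabled, and reports where it happened plus the handler address.
 */
static bool mock_check_irqs(const char *where, const struct mock_urb *urb)
{
	if (!mock_irqs_enabled)
		return false;

	fprintf(stderr, "interrupts enabled! (%s, handler %#lx)\n",
		where, (unsigned long)(uintptr_t)urb->complete);
	return true;
}

/*
 * Mock of the giveback path: check on entry, call the completion
 * handler, check again on exit.  If IRQs were off on entry but on at
 * exit, the handler is the culprit, so return it; otherwise NULL.
 */
static mock_complete_t mock_giveback_urb(struct mock_urb *urb)
{
	bool bad_before, bad_after;

	assert(urb && urb->complete);

	bad_before = mock_check_irqs("giveback entry", urb);
	urb->complete(urb);
	bad_after = mock_check_irqs("giveback exit", urb);

	return (!bad_before && bad_after) ? urb->complete : NULL;
}

/* A well-behaved handler: leaves the IRQ state alone. */
static void good_handler(struct mock_urb *urb)
{
	(void)urb;
}

/* A buggy handler: re-enables IRQs, as the traces seem to suggest. */
static void bad_handler(struct mock_urb *urb)
{
	(void)urb;
	mock_irqs_enabled = true;
}
```

Calling mock_giveback_urb() on an URB whose handler is bad_handler,
with mock_irqs_enabled initially false, returns bad_handler; with
good_handler it returns NULL. A check that fires already on entry
points at some caller further up the stack rather than at the
completion handler.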

- Dave