Re: USB regression (and other failures) in 2.6.2[45]*

From: Alan Stern
Date: Sat Feb 16 2008 - 18:11:54 EST


On Sat, 16 Feb 2008, Andrew Buehler wrote:

> > For another, getting two copies of a message is no big deal --
>
> I disagree.

Everyone has his own taste. Obviously there's no world-wide consensus,
possibly because different people have different workflow habits and so
are affected by duplicate messages to varying extents.

> When I receive a message sent directly to me in a discussion which is on
> a list, I expect that it is because someone either considered it
> important enough to warrant making certain it came to my attention
> specifically, or wanted to continue the discussion but felt that it
> should not continue to take place on the mailing list.

Sometimes that is the case but often it isn't. Your expectations are
at variance with other people's behavior; you shouldn't expect everyone
else to change just to match your personal ideals.

On the other hand, I would be perfectly happy to edit your name out of
the reply list -- but since you said you aren't receiving all the
messages in this thread via the list that might not be a good thing to
do at the moment...


> > People on LKML who are more familiar with interrupt routing problems
> > might be able to offer more help. For now, you can try things like
> > turning on CONFIG_USB_DEBUG, posting the output from dmesg, posting
> > the contents of /proc/interrupts (say before and after a new USB
> > device is plugged in).
>
> In my current testing kernel, which I believe is the one with which I
> captured the sole successful non-2.6.23.1 dmesg so far, CONFIG_USB_DEBUG
> is on. The associated dmesg (obtained yesterday from booting with the
> Flash drive connected) is attached. (The flood of 'no version magic,
> tainting kernel' messages between line 600 and line 1160 are a side
> effect of Novell's custom environment which I have not yet made the
> effort to fix; the boot scripts attempt to detect the network card by
> modprobing every network driver available until they find one which
> works. Here, because the correct one fails, they wind up trying each one
> twice.)

The line saying:

> ehci_hcd 0000:00:1d.7: Unlink after no-IRQ? Controller is probably using the wrong IRQ.

is an indication that interrupt routing is indeed not working right.
Or possibly your EHCI controller isn't working. You could try
blacklisting or unloading ehci-hcd to see if that helps. Of course
then none of your USB devices would be able to run at high speed.

> I have transcribed the contents of /proc/interrupts both before and
> after plugging in the Flash drive I have been using for testing, and
> they are also attached. I have been as careful as I could to be sure
> that the contents of the attached 'r61-interrupts-[12].txt' files is the
> same as what was shown on the laptop, but cannot absolutely guarantee
> that I have not missed something. For the record, the '1' is from before
> connecting the drive, and the '2' is from after.

Notice that the interrupt count for IRQ 11 doesn't change when you plug
in the device. Obviously something is wrong there.

In fact, it's a little surprising that almost all the USB controllers
are routed to the same IRQ. However this is beyond my area of
expertise. You could try posting a message on the linux-acpi mailing
list; the people there should know a lot more about these issues.


> > Assuming that the 2.6.23 kernel works on your computer, you can go
> > the extreme route of installing git and doing a bisection to find the
> > first patch causing your difficulty.
>
> That would require me to learn enough of how git works, as distinct from
> more traditional VCSes, to be able to use it with some confidence. This
> is not impossible - indeed I want to do it at some point - but for the
> time being I have no idea where to start, and indeed I am not especially
> clear on exactly what (from a user's perspective) the differences been
> git and e.g. CVS or Subversion are. I know that the entire concept
> relies around a lack of centralization, but I have not been able to get
> my head around what that means in a practical sense.

There are some excellent tutorials on the web, with detailed
explanations of how to do a bisection to track down a kernel bug.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/