Re: [PATCH] pciehp: Add pciehp_surprise module option

From: Takashi Iwai
Date: Thu Apr 11 2013 - 09:41:01 EST


At Wed, 10 Apr 2013 11:19:48 -0600,
Bjorn Helgaas wrote:
>
> On Wed, Apr 10, 2013 at 10:34 AM, Takashi Iwai <tiwai@xxxxxxx> wrote:
> > Hi Bjorn,
> >
> > sorry for the late follow up as I was on vacation and has been busy
> > for other tasks. Since this topic went to nirvana, I try to whip
> > again...
> >
> > At Mon, 25 Mar 2013 10:58:40 -0600,
> > Bjorn Helgaas wrote:
> >>
> >> On Wed, Mar 20, 2013 at 8:02 AM, Takashi Iwai <tiwai@xxxxxxx> wrote:
> >> > We encountered a problem that on some HP machines the Realtek PCI-e
> >> > card reader device appears only when you inserted a card before the
> >> > cold boot. While debugging, it turned out that the device is actually
> >> > handled via PCI-e hotplug in some level. The device sends a presence
> >> > change notification, and pciehp receives it, but it's ignored because
> >> > of lack of the hotplug surprise (PCI_EXP_SLTCAP_HPS) capability bit.
> >> > Once when this check passes, everything starts working -- the device
> >> > appears upon plugging the card properly.
> >> >
> >> > There are a few other bug reports indicating the similar problems
> >> > (e.g. on recent Dell laptops), and I guess the culprit is same.
> >>
> >> Can you point us at these bug reports, e.g., with URLs? Hopefully
> >> they will contain complete dmesg logs and "lspci -vv" outputs so we
> >> can debug this a bit more.
> >
> > The machine isn't in market yet, so we cannot expose all things, but I
> > attach the lspci snippet of the relevant parts, pci-e ports and the
> > card reader, at least. If you need anything else, let me know.
> >
> > As Oliver and Michal already replied, Windows (both 7/8) identifies
> > the device without modification. This implies that Windows handles
> > the hotplug no matter whether the surprise bit is set or not, either
> > globally or device-specifically. But, since this is pretty new
> > hardware, we highly doubt it's done in a white-list basis.
> >
> >> I'm strongly opposed to adding a module option to work around this
> >> issue because the user experience is unacceptable. We can't expect
> >> users to debug the problem and discover the option.
> >>
> >> I'm also opposed to a DMI quirk system because I think it's very
> >> likely that this device works correctly under Windows, and I doubt
> >> very much that Windows has to include the equivalent of DMI quirks.
> >> So we should, at least in principle, be able to figure out how to make
> >> it work, too.
> >
> > In order to get things a bit straight, let me list up the things we
> > found again:
> >
> > - The Realtek card reader devices doesn't appear in lspci at the
> > fresh boot in multiple kernel versions from 3.0 to 3.9.
> >
> > - Once when the card is inserted, it issues the hotplug IRQ event.
> >
> > - pciehp receives and handles the event but it doesn't add/remove the
> > device actually because the corresponding controller has no surprise
> > bit.
> >
> > When forcibly enabling the hotplug device addition by my patch, it
> > starts working. The device is added at card insert. The removal
> > doesn't trigger on our system, but the event itself seems
> > generated.
> >
> > - The surprise bit can't be changed as it's supposed to be read-only
> > register bits. Thus no PCI quirk seems possible, and it has to be
> > fixed in pciehp.
> >
> > - Another way to detect the PCI card reader device is to perform
> > echo 1 > /sys/bus/pci/rescan
> > with a memory card inserted. It doesn't work without the card,
> > and it is less sophisticated than pciehp, of course.
> >
> > Right now, we applied a patch for pciehp to ignore the surprise bit
> > per basis of DMI string match. This works, but doesn't scale; if the
> > same problem happens on a similar model, the driver must be compiled
> > again. A module option would be really convenient for that, although
> > I understand your concern, too.
> >
> > Of course, an alternative (and more radical) solution is to remove the
> > surprise bit check completely from pciehp, as Matthew suggested in the
> > thread. What risk would it bring?
>
> I think we need to ignore the surprise bit as Matthew suggested.

OK, this is certainly a cleaner way.

> Alex raised an issue with this (secondary bus reset causes a device
> remove followed by device add), so we'd have to work through that
> somehow. I think doing the remove/add would actually be more correct
> because we would be doing the whole device initialization after the
> reset. We currently save/restore some device state around the reset,
> but that's a piecemeal approach that is certain to miss internal
> hidden state. But I don't know how to deal with the KVM implications
> of remove/add.

So is this specific to KVM?


In anyway, if any test patch is available, let me (or Oliver, Michal)
know. We'll happily try on the machine showing the bug.


thanks,

Takashi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/