Re: [RFC PATCH 00/10] PCI core learns 'hotplug'

From: Trent Piepho
Date: Thu Jan 29 2009 - 05:44:54 EST


On Wed, 28 Jan 2009, Alex Chiang wrote:
> A while ago, Darrick Wong posted a patch for fakephp that kicked off
> some controversy:
>
> http://thread.gmane.org/gmane.linux.kernel/761944
>
> The issue was that I broke the fakephp interface back in the 2.6.27
> timeframe. After some discussion on the lists, Trent Piepho sent some
> patches, and I proposed a solution incorporating those patches.
>
> This is my first cut at making everyone happy. In summary, it:
>
> - introduces /sys/bus/pci/devices/.../remove for function level
> hot-remove
>
> - introduces /sys/bus/pci/devices/.../rescan to rescan the PCI
> hierarchy, starting at that device and descending to all children
>
> - introduces /sys/bus/pci/rescan to rescan the entire PCI hierarchy
>
> - restores the pre-2.6.27 fakephp interface for userspace compatability

I also continued to work on my patches, but then my reasons for caring
about PCI hotplug disappeared due to the current economic climate.

I updated my "remove" patch to include documentation. I created a patch
that added "/sys/bus/pci/scan", but not the per-device version. And I
updated my new fakephp driver to support rescanning.

Everything worked, but when a bridge was rescanned there would be annoying
warning messages. I never got around to figuring about what to do about
that. It seems like the code that assigns bridge resources wasn't intended
to handle bridges that already had resources assigned to them, though it
does work.

Maybe your series can use my latest patches for removal and legacy_fakephp?
It sounds like your patches for rescanning do more than mine.


> - I've been testing this patchset on my ia64 machines, which Linus
> has called "an insane mess of PCI bridges"[1], and it seems to
> work well. I'm just starting to test on some x86 machines, and
> have been noticing some issues with BAR collisions, so this is
> definitely a work-in-progress.

Does it not work, or is it just warnings? I didn't have any problems with
resources ending up unassigned, but I did get warnings. I think there was
also an issue with removed and rescanned devices' resources' ->parent
pointers not being the same as they were before removal. Which doesn't
seem to matter any, but made me feel like the code wasn't right yet.

> If you use the new PCI core removal/rescan and then try to modify
> the slot using acpiphp, you get an oops. My impression is that
> this behavior is the same as pre-2.6.27, where you could have
> loaded fakephp and acpiphp, removed the device with fakephp,
> and encountered an oops with acpiphp.

I came to the same conclusion. fakephp or acpiphp will oops if you use the
other to remove a pci device. The drivers just aren't designed to handle a
pci device being removed out from under them.

> So, I'm not sure what to do about this. The way that we remove
> devices today, using pci_remove_bus_device() doesn't lend itself
> to safety very well, since it will just start removing devices
> from the bus without checking anything.
>
> Maybe we need some other API, or maybe we just live with the
> limitation of, "if you use PCI core hotplug, don't use the
> other hotplug drivers and vice versa".

My new fakephp driver seems to handle this ok, maybe other php drivers
could do the same thing?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/