Re: [PATCH v2 2/2] PCI: pciehp: Abort hot-plug if pci_hp_add_bridge() fails

From: Lukas Wunner
Date: Mon May 06 2024 - 15:37:04 EST


On Mon, May 06, 2024 at 10:37:01AM +0200, Nam Cao wrote:
> I just discovered a new crash scenario with shpchp:
>
> First, hot-add a bridge:
> (qemu) device_add pci-bridge,id=br2,bus=br1,chassis_nr=1,addr=1
>
> But with the current patch, the hot-plug is still successful, just that the
> bridge is not properly configured:
> $ lspci -t
> -[0000:00]-+-00.0
> +-01.0
> +-02.0
> +-03.0-[01-02]----00.0-[02]----01.0-- <-- new hot-added bridge
> +-1f.0
> +-1f.2
> \-1f.3
>
> But! Now I leave the hot-added bridge as is, and hot-add an ethernet card:
> (qemu) device_add e1000,bus=br1,id=eth0,addr=2
>
> this command crashes the kernel (full crash log below).
>
> The problem is because during hot-plugging of this ethernet card,
> pci_bus_add_devices() is invoked and the previously hot-plugged bridge hasn't
> been marked as "added" yet, so the driver attempts to add this bridge
> again, leading to the crash.
>
> Now for pciehp, this scenario cannot happen, because there is only a single
> slot, so we cannot hot-plug another device. But still, this makes me think
> perhaps we shouldn't leave the hot-plugged bridge in a improperly-configured
> state as this patch is doing. I cannot think of a scenario that would crash pciehp
> similarly to shpchp. But that's just me, maybe there is a rare scenario
> that can crash pciehp, but I just haven't seen yet.
>
> My point is: better safe than sorry. I propose going back to the original
> solution: calling pci_stop_and_remove_bus_device() and return a negative
> error, and get rid of this improperly configured bridge. It is useless
> anyways without a subordinate bus number.
>
> What do you think?

When the kernel runs out of BAR address space for devices downstream of
a bridge, it doesn't de-enumerate the bridge. The devices are just
unusable.

So my first intuition is that running out of bus numbers shouldn't result
in de-enumeration either.

Remind me, how exactly does the NULL pointer deref occur? I think it's
because no struct pci_bus was allocated for the subordinate bus of the
hot-plugged bridge, right? Because AFAICS that would happen in

pci_hp_add_bridge()
pci_can_bridge_extend()
pci_add_new_bus()
pci_alloc_child_bus()

but we never get that far because pci_hp_add_bridge() bails out with -1.
So the subordinate pointer remains a NULL pointer.

Perhaps we should allocate an empty struct pci_bus instead.
Or check for a NULL subordinate pointer instead of crashing.

I can't really say off the top of my head which solution is appropriate
here, it requires going through the code paths and identifying the
complexity associated with each solution.

Thanks,

Lukas