Re: Re: [PATCH v2] PCI: Fix Oops caused by uninitialized slot->list in pci_slot_release()

From: zhongjubin
Date: Mon Dec 07 2020 - 04:53:16 EST


>On Wed, Dec 02, 2020 at 10:33:42AM +0800, Jubin Zhong wrote:
>> Once kobject_init_and_add() failed, pci_slot_release() is called to
>> delete slot->list from parent->slots. But slot->list is intialized
>> afterwards, so we ran into the following crash:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address
>> 00000000
>> ...
>> CPU: 10 PID: 1 Comm: swapper/0 Not tainted 4.4.240 #197
>> task: ffffeb398a45ef10 task.stack: ffffeb398a470000
>> PC is at __list_del_entry_valid+0x5c/0xb0
>> LR is at pci_slot_release+0x84/0xe4
>> ...
>> __list_del_entry_valid+0x5c/0xb0
>> pci_slot_release+0x84/0xe4
>> kobject_put+0x184/0x1c4
>> pci_create_slot+0x17c/0x1b4
>> __pci_hp_initialize+0x68/0xa4
>> pciehp_probe+0x1a4/0x2fc
>> pcie_port_probe_service+0x58/0x84
>> driver_probe_device+0x320/0x470
>> __driver_attach+0x54/0xb8
>> bus_for_each_dev+0xc8/0xf0
>> driver_attach+0x30/0x3c
>> bus_add_driver+0x1b0/0x24c
>> driver_register+0x9c/0xe0
>> pcie_port_service_register+0x64/0x7c
>> pcied_init+0x44/0xa4
>> do_one_initcall+0x1d0/0x1f0
>> kernel_init_freeable+0x24c/0x254
>> kernel_init+0x18/0xe8
>> ret_from_fork+0x10/0x20
>>
>> Fixes: 8a94644b440e ("PCI: Fix pci_create_slot() reference count
>> leak")
>> Signed-off-by: Jubin Zhong <zhongjubin@xxxxxxxxxx>
>> Cc: stable@xxxxxxxxxxxxxxx #v4.4.235

>I'm curious how you noticed this. Did kobject_init_and_add() fail for some reason? Is there a bug there that we need to fix also?

Actually not.
In the very beginning, Coverity reported a RESOURCE_LEAK warning of slot memory. It should be a false alarm because pci_slot_release() is already called to free slot memory as a function hook (which Coverity could not recognize). After code review I prepared to cancel the warning, that's when I noticed the unintialized slot list is also deleted by pci_slot_release().

>And why did you mention v4.4.235 in the stable tag? 8a94644b440e wasn't merged until v5.9. If 8a94644b440e was backported to older kernels, we can't know that in general, so I think whoever backported it is responsible for watching for Fixes tags that mention it.

The Coverity warning was reported on v4.4.240. I also did a little test on v4.4.240 by forcing kobject_init_and_add() to fail, that's how I got the above call stack. Since 8a94644b440e was backported to v4.4.235, I thought it should be necessary to fix it on v4.4 too.

>>I updated the stable tag to "v5.9+" and applied this to pci/hotplug for v5.11, thanks!

>> ----
>> v2:
>> Since both slot memory and slot->list would be handled by
>> pci_slot_release(), we need to make sure slot->list is properly
>> initialized beforehand.
>>
>> v1: https://lore.kernel.org/lkml/1606288971-47927-1-git-send-email-zhongjubin@xxxxxxxxxx/
>> Two things need to be cleaned up on pci_create_slot's error path:
>> 1. free slot memory
>> 2. remove slot->list from its parent->slots
>> This patch mistakenly took slot memory as unfreed (which is not),
>> and would introduce double free problem.
>> ---
>> drivers/pci/slot.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c index
>> 3861505..ed2077e 100644
>> --- a/drivers/pci/slot.c
>> +++ b/drivers/pci/slot.c
>> @@ -272,6 +272,9 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr,
>> goto err;
>> }
>>
>> + INIT_LIST_HEAD(&slot->list);
>> + list_add(&slot->list, &parent->slots);
>> +
>> err = kobject_init_and_add(&slot->kobj, &pci_slot_ktype, NULL,
>> "%s", slot_name);
>> if (err) {
>> @@ -279,9 +282,6 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr,
>> goto err;
>> }
>>
>> - INIT_LIST_HEAD(&slot->list);
>> - list_add(&slot->list, &parent->slots);
>> -
>> down_read(&pci_bus_sem);
>> list_for_each_entry(dev, &parent->devices, bus_list)
>> if (PCI_SLOT(dev->devfn) == slot_nr)
>> --
>> 1.8.5.6
>>