Re: [Question] Any plan to support enable PCI SRIOV concurrently in kernel side?

From: Bjorn Helgaas
Date: Wed Aug 17 2022 - 15:49:57 EST


On Wed, Aug 17, 2022 at 07:43:34AM +0000, Zhoujian (jay) wrote:
> Hi,
>
> Enabling SRIOV concurrently on many different PFs from userspace seems workable.
> I'm trying to do it with 8 PFs (each one with 240+ VFs), but I get some warnings;
> here is the backtrace:

This definitely seems like a problem that should be fixed. If you
have a script that can reproduce it, that might help people work on
it. If you can reproduce it in qemu, that would be even better.

Some comments on the patch below.

> Warning 1:
> ---
> sysfs: cannot create duplicate filename '/devices/pci0000:30/0000:30:02.0/pci_bus/0000:32'
> Call Trace:
> dump_stack+0x6f/0xab
> sysfs_warn_dup+0x56/0x70
> sysfs_create_dir_ns+0x80/0x90
> kobject_add_internal+0xa0/0x2b0
> kobject_add+0x71/0xd0
> device_add+0x126/0x630
> pci_add_new_bus+0x17c/0x4b0
> pci_iov_add_virtfn+0x336/0x390
> sriov_enable+0x26e/0x450
> virtio_pci_sriov_configure+0x61/0xc0 [virtio_pci]
> ---
> The reason is that different VFs may resolve to the same PCI bus number
> and try to add the new bus concurrently in virtfn_add_bus().
>
> Warning 2:
> ---
> proc_dir_entry 'pci/33' already registered
> WARNING: CPU: 71 PID: 893 at fs/proc/generic.c:360 proc_register+0xf8/0x130
> Call Trace:
> proc_mkdir_data+0x5d/0x80
> pci_proc_attach_device+0xe9/0x120
> pci_bus_add_device+0x33/0x90
> pci_iov_add_virtfn+0x375/0x390
> sriov_enable+0x26e/0x450
> virtio_pci_sriov_configure+0x61/0xc0 [virtio_pci]
> ---
> The reason is that different VFs may create '/proc/bus/pci/bus_number'
> directory using the same bus number in pci_proc_attach_device concurrently.
>
> A mutex can avoid the conflict. With the patch below, the warnings above
> no longer appear.
>
> My question: is there any plan to support enabling PCI SRIOV concurrently on the kernel side?
>
> Thanks
>
> ---
> drivers/pci/iov.c | 29 +++++++++++++++++++++++++++--
> 1 file changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index 952217572113..6a8a849298c4 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -16,6 +16,12 @@
>
> #define VIRTFN_ID_LEN 16
>
> +static struct mutex add_bus_mutex;
> +static int add_bus_mutex_initialized;
> +
> +static struct mutex add_device_mutex;
> +static int add_device_mutex_initialized;
> +
> int pci_iov_virtfn_bus(struct pci_dev *dev, int vf_id)
> {
> if (!dev->is_physfn)
> @@ -127,13 +133,24 @@ static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr)
> if (bus->number == busnr)
> return bus;
>
> + if (!add_bus_mutex_initialized) {
> + mutex_init(&add_bus_mutex);
> + add_bus_mutex_initialized = 1;
> + }

I assume this patch works around the warning. I see the intent here,
but I think it would need some rework before merging. These locks
protect pci_add_new_bus() and pci_bus_add_device(), but only for the
callers in iov.c. These interfaces are both called from places other
than iov.c, and any mutual exclusion should cover all of them.

I'm actually not sure how the other callers are protected. I assume
we're holding a device_lock for some device farther up the chain. Or,
I see that acpi_pci_root_add() and rescan_store() hold
pci_rescan_remove_lock while calling these. We don't seem to hold
that uniformly though, which bothers me, because I think there are
many other paths, e.g., pci_host_probe() and its callers.

> + mutex_lock(&add_bus_mutex);
> +
> child = pci_find_bus(pci_domain_nr(bus), busnr);
> - if (child)
> + if (child) {
> + mutex_unlock(&add_bus_mutex);
> return child;
> + }
>
> child = pci_add_new_bus(bus, NULL, busnr);
> - if (!child)
> + if (!child) {
> + mutex_unlock(&add_bus_mutex);
> return NULL;
> + }
> + mutex_unlock(&add_bus_mutex);
>
> pci_bus_insert_busn_res(child, busnr, busnr);
>
> @@ -339,8 +356,16 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id)
> if (rc)
> goto failed1;
>
> + if (!add_device_mutex_initialized) {
> + mutex_init(&add_device_mutex);
> + add_device_mutex_initialized = 1;
> + }
> + mutex_lock(&add_device_mutex);
> +
> pci_bus_add_device(virtfn);
>
> + mutex_unlock(&add_device_mutex);
> +
> return 0;
>
> failed1:
> ---