[ISSUE] Cannot enable VF after remove/rescan

From: Yicong Yang
Date: Fri Aug 19 2022 - 06:21:56 EST


Hi Ixgbe maintainers,

We hit an issue where a VF of the 82599 cannot be enabled after the PF device is removed and rescanned.
The PCI hierarchy on our platform looks like this:
[...]
 +-[0000:80]-+-00.0-[81]--
 |           +-04.0-[82]--
 |           +-08.0-[83]--+-00.0 Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection
 |           |            \-00.1 Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection
 |           \-10.0-[84]--
[...]

Steps to reproduce:

[root@localhost ~]# cat /sys/class/net/enp131s0f0/device/sriov_numvfs
0
[root@localhost ~]# echo 1 > /sys/class/net/enp131s0f0/device/sriov_numvfs # enable 1 VF
[root@localhost ~]# echo 1 > /sys/bus/pci/devices/0000:83:00.0/remove # remove the PF
[root@localhost ~]# echo 1 > /sys/bus/pci/rescan # rescan the PF
[root@localhost ~]# cat /sys/class/net/enp131s0f0/device/sriov_numvfs
0
[root@localhost ~]# echo 1 > /sys/class/net/enp131s0f0/device/sriov_numvfs # attempt to enable the VF
[ 433.568996] ixgbe 0000:83:00.0 enp131s0f0: SR-IOV enabled with 1 VFs
[ 433.639027] ixgbe 0000:83:00.0: Multiqueue Enabled: Rx Queue count = 4, Tx Queue count = 4 XDP Queue count = 0
[ 433.652932] ixgbe 0000:83:00.0: can't enable 1 VFs (bus 84 out of range of [bus 83])
[ 433.661228] ixgbe 0000:83:00.0: Failed to enable PCI sriov: -12
-bash: echo: write error: Cannot allocate memory


Further investigation shows that the SR-IOV VF offset changed after the rescan, so no PCI bus
number is available for the VF device (bus 84 is already occupied):

Before the remove:
[root@localhost ~]# lspci -vvs 83:00.0
Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
IOVCap: Migration- 10BitTagReq- Interrupt Message Number: 000
IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+ 10BitTagReq-
IOVSta: Migration-
Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
VF offset: 128, stride: 2, Device ID: 10ed
Supported Page Size: 00000553, System Page Size: 00000001
Region 0: Memory at 0000280000804000 (64-bit, prefetchable)
Region 3: Memory at 0000280000904000 (64-bit, prefetchable)
VF Migration: offset: 00000000, BIR: 0

After the rescan:
[root@localhost ~]# lspci -vvs 83:00.0
Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
IOVCap: Migration- 10BitTagReq- Interrupt Message Number: 000
IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy- 10BitTagReq-
IOVSta: Migration-
Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
VF offset: 384, stride: 2, Device ID: 10ed
^^^^^^^^^^^^^^
offset has changed
Supported Page Size: 00000553, System Page Size: 00000001
Region 0: Memory at 0000280000804000 (64-bit, prefetchable)
Region 3: Memory at 0000280000904000 (64-bit, prefetchable)
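
To make the link between the offset and the "bus 84 out of range" message explicit, here is
a small sketch of the VF address arithmetic. It assumes the usual SR-IOV rule that the routing
ID of VF n is the PF routing ID plus the First VF Offset plus n times the VF Stride, with any
carry out of the 8-bit devfn moving the VF to the next bus. The function name vf_bdf is made
up for illustration and is not a kernel or pciutils interface.

```shell
#!/usr/bin/env bash
# Sketch only: compute the bus:device.function of a VF from the PF address
# and the SR-IOV "VF offset" / "stride" values printed by lspci.
# vf_bdf is a hypothetical helper name, not an existing tool.

# Usage: vf_bdf PF_BUS_HEX PF_DEV_HEX PF_FN VF_OFFSET VF_STRIDE VF_INDEX
vf_bdf() {
    local pf_bus=$((16#$1)) pf_dev=$((16#$2)) pf_fn=$3
    local offset=$4 stride=$5 idx=$6

    # devfn is 8 bits: device number in bits 7:3, function number in bits 2:0
    local pf_devfn=$(( (pf_dev << 3) | pf_fn ))

    # Assumed rule: VF routing ID = PF routing ID + offset + stride * index
    local sum=$(( pf_devfn + offset + stride * idx ))

    # Anything above 8 bits carries into the bus number
    local bus=$(( pf_bus + (sum >> 8) ))
    local devfn=$(( sum & 0xff ))

    printf '%02x:%02x.%x\n' "$bus" $(( devfn >> 3 )) $(( devfn & 7 ))
}

vf_bdf 83 00 0 128 2 0   # offset before the remove  -> 83:10.0
vf_bdf 83 00 0 384 2 0   # offset after the rescan   -> 84:10.0
```

With offset 128 the first VF stays on bus 83, while with offset 384 it lands on bus 84, which
is consistent with the error above. The two dumps also differ in the ARIHierarchy bit
(+ before, - after); whether that is what drives the offset change is only a guess on our side.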


We don't know why the SR-IOV VF offset changed (the stride is 2 in both dumps) or whether
something is wrong. Any help on how to explain or fix this is highly appreciated! Please let us
know if more information is needed.
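
In case it helps with collecting more data, below is a minimal sketch for recording the SR-IOV
parameters before and after the remove/rescan without parsing lspci output. It assumes the
kernel exposes the sriov_offset and sriov_stride sysfs attributes next to sriov_numvfs (they
may be missing on older kernels, in which case the helper prints "unavailable"). The function
name sriov_params is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: print the SR-IOV sysfs attributes of a PF so that two
# snapshots can be compared. sriov_params is a made-up helper name.

# Usage: sriov_params SYSFS_DEVICE_DIR
sriov_params() {
    local dir=$1 attr
    for attr in sriov_totalvfs sriov_numvfs sriov_offset sriov_stride; do
        if [ -r "$dir/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$dir/$attr")"
        else
            # Attribute not provided by this kernel or device
            printf '%s=unavailable\n' "$attr"
        fi
    done
}

# Run only when a device directory is passed on the command line
if [ "$#" -ge 1 ]; then
    sriov_params "$1"
fi
```

For example, saving the output for /sys/bus/pci/devices/0000:83:00.0 to one file before the
remove and to another after the rescan, then running diff on the two files, should show
whether the offset reported by the kernel changes in the same way as in the lspci dumps.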

Thanks,
Yicong