Re: [PATCH 1/2] platform/x86: ISST: Optimize CPU to PCI device mapping

From: Hans de Goede
Date: Thu Jun 17 2021 - 07:37:09 EST


Hi Srinivas,

On 6/17/21 12:13 AM, Srinivas Pandruvada wrote:
> It was observed that some of the high performance benchmarks are spending
> more time in kernel depending on which CPU package they are executing.
> The difference is significant and benchmark scores varies more than 10%.
> These benchmarks adjust class of service to improve thread performance
> which run in parallel. This class of service change causes access to
> MMIO region of Intel Speed Select PCI devices depending on the CPU
> package they are executing.
>
> This mapping from CPU to PCI device instance uses a standard Linux PCI
> interface "pci_get_domain_bus_and_slot()". This function does a linear
> search to get to a PCI device. Since these platforms have 100+ PCI
> devices, this search can be expensive in fast path for benchmarks.
>
> Since the device and function of PCI device is fixed for Intel
> Speed Select PCI devices, the CPU to PCI device information can be cached
> at the same time when bus number for the CPU is read. In this way during
> runtime the cached information can be used. This improves performance
> of these benchmarks significantly.
>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>
> ---
> .../intel_speed_select_if/isst_if_common.c | 29 +++++++++++++++----
> 1 file changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/platform/x86/intel_speed_select_if/isst_if_common.c b/drivers/platform/x86/intel_speed_select_if/isst_if_common.c
> index 0c2aa22c7a12..aedb8310214c 100644
> --- a/drivers/platform/x86/intel_speed_select_if/isst_if_common.c
> +++ b/drivers/platform/x86/intel_speed_select_if/isst_if_common.c
> @@ -281,11 +281,27 @@ static int isst_if_get_platform_info(void __user *argp)
> struct isst_if_cpu_info {
> /* For BUS 0 and BUS 1 only, which we need for PUNIT interface */
> int bus_info[2];
> + struct pci_dev *pci_dev[2];
> int punit_cpu_id;
> };
>
> static struct isst_if_cpu_info *isst_cpu_info;
>
> +static struct pci_dev *_isst_if_get_pci_dev(int cpu, int bus_no, int dev, int fn)
> +{
> + int bus_number;
> +
> + if (bus_no < 0 || bus_no > 1 || cpu < 0 || cpu >= nr_cpu_ids ||
> + cpu >= num_possible_cpus())
> + return NULL;
> +
> + bus_number = isst_cpu_info[cpu].bus_info[bus_no];
> + if (bus_number < 0)
> + return NULL;
> +
> + return pci_get_domain_bus_and_slot(0, bus_number, PCI_DEVFN(dev, fn));
> +}
> +
> /**
> * isst_if_get_pci_dev() - Get the PCI device instance for a CPU
> * @cpu: Logical CPU number.
> @@ -300,17 +316,18 @@ static struct isst_if_cpu_info *isst_cpu_info;
> */
> struct pci_dev *isst_if_get_pci_dev(int cpu, int bus_no, int dev, int fn)
> {
> - int bus_number;
> + struct pci_dev *pci_dev;
>
> if (bus_no < 0 || bus_no > 1 || cpu < 0 || cpu >= nr_cpu_ids ||
> cpu >= num_possible_cpus())
> return NULL;
>
> - bus_number = isst_cpu_info[cpu].bus_info[bus_no];
> - if (bus_number < 0)
> - return NULL;
> + pci_dev = isst_cpu_info[cpu].pci_dev[bus_no];

If the _isst_if_get_pci_dev() call below fails, then pci_dev might
end up getting set to NULL here.

>
> - return pci_get_domain_bus_and_slot(0, bus_number, PCI_DEVFN(dev, fn));
> + if (pci_dev->devfn == PCI_DEVFN(dev, fn))

And then this would lead to a NULL ptr deref, I've replaced this
the above if with:

if (pci_dev && pci_dev->devfn == PCI_DEVFN(dev, fn))

to avoid this.

I've applied this series with the above change
to my review-hans branch:
https://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git/log/?h=review-hans

Note it will show up in my review-hans branch once I've pushed my
local branch there, which might take a while.

Once I've run some tests on this branch the patches there will be
added to the platform-drivers-x86/for-next branch and eventually
will be included in the pdx86 pull-request to Linus for the next
merge-window.

Regards,

Hans





> + return pci_dev;
> +
> + return _isst_if_get_pci_dev(cpu, bus_no, dev, fn);
> }
> EXPORT_SYMBOL_GPL(isst_if_get_pci_dev);
>
> @@ -327,6 +344,8 @@ static int isst_if_cpu_online(unsigned int cpu)
> } else {
> isst_cpu_info[cpu].bus_info[0] = data & 0xff;
> isst_cpu_info[cpu].bus_info[1] = (data >> 8) & 0xff;
> + isst_cpu_info[cpu].pci_dev[0] = _isst_if_get_pci_dev(cpu, 0, 0, 1);
> + isst_cpu_info[cpu].pci_dev[1] = _isst_if_get_pci_dev(cpu, 1, 30, 1);
> }
>
> ret = rdmsrl_safe(MSR_THREAD_ID_INFO, &data);
>