RE: [PATCH V2] perf/x86/intel/uncore: Querying number of CHAs from CAPID6 register

From: Kroening, Gary
Date: Tue Mar 13 2018 - 16:25:09 EST


I've tested this patch on the same set of hubless (single-segment) and scalable (segment-per-socket) configurations that I used for Kan's version 1.
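
For anyone following along, the difference between those two configurations can be reproduced in userspace. The following is only a sketch of the removed loop quoted below, not kernel code; struct fake_cha_dev, the function name, and the segment and bus numbers are all made up for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PCI device's location; not a kernel type. */
struct fake_cha_dev {
	int segment;	/* PCI segment (domain) */
	int bus;	/* bus number within that segment */
};

/*
 * Mirrors the removed loop: count devices until the bus number changes.
 * The segment is deliberately ignored, as it was in the old code.
 */
static int count_until_bus_changes(const struct fake_cha_dev *devs, size_t n)
{
	int bus = 0, count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (count == 0)
			bus = devs[i].bus;
		if (bus != devs[i].bus)
			break;
		count++;
	}
	return count;
}

/* Two sockets, two CHAs each, one shared segment: bus differs per socket. */
static const struct fake_cha_dev single_segment[] = {
	{ 0, 0x17 }, { 0, 0x17 }, { 0, 0x85 }, { 0, 0x85 },
};

/* Same sockets, one segment per socket: the bus number repeats. */
static const struct fake_cha_dev segment_per_socket[] = {
	{ 0, 0x17 }, { 0, 0x17 }, { 1, 0x17 }, { 1, 0x17 },
};
```

With these made-up layouts the function returns 2 for single_segment (the per-socket count) and 4 for segment_per_socket (every CHA in the system), which is the overcount described in the quoted commit message.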

As far as we can tell, this will also work for Cascade Lake, but it will need revisiting for Ice Lake.
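
The counting in the quoted patch comes down to a mask followed by a population count. A minimal userspace sketch of that arithmetic follows, assuming a raw CAPID6 value has already been read by some other means; CHA_BIT_MASK, popcount32() and count_chas() are illustrative names standing in for the kernel's SKX_CHA_BIT_MASK and hweight32(), not kernel interfaces.

```c
#include <stdint.h>

/* Bits 27:0, the same range the patch selects with GENMASK(27, 0). */
#define CHA_BIT_MASK 0x0fffffffu

/* Portable population count standing in for the kernel's hweight32(). */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Number of CHAs encoded in a raw CAPID6 value (illustrative helper). */
static unsigned int count_chas(uint32_t capid6)
{
	return popcount32(capid6 & CHA_BIT_MASK);
}
```

For a made-up value such as 0xf0003fff, count_chas() returns 14: the top four bits fall outside the mask and the low fourteen bits are set.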

Thanks.
Gary

> -----Original Message-----
> From: kan.liang@xxxxxxxxxxxxxxx [mailto:kan.liang@xxxxxxxxxxxxxxx]
> Sent: Tuesday, March 13, 2018 1:52 PM
> To: mingo@xxxxxxxxxx; hpa@xxxxxxxxx; tglx@xxxxxxxxxxxxx;
> peterz@xxxxxxxxxxxxx; andy.shevchenko@xxxxxxxxx
> Cc: Kroening, Gary; Travis, Mike; Banman, Andrew; Sivanich, Dimitri;
> Anderson, Russ; x86@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Kan Liang
> Subject: [PATCH V2] perf/x86/intel/uncore: Querying number of CHAs from
> CAPID6 register
>
> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>
> The number of CHAs is miscalculated on Skylake server systems with
> multiple PCI domains.
>
> (From Kroening, Gary:
>
> For systems with a single PCI segment, it is sufficient to watch for the
> bus number to change in order to determine that all of the CHAs have
> been counted for a single socket.
> However, for multi-PCI-segment systems, each socket is given a new
> segment and the bus number does NOT change. So looking only for the
> bus number to change ends up counting all of the CHAs on all sockets
> in the system. This leads to writing CPU MSRs beyond the valid range and
> causes an error in ivbep_uncore_msr_init_box().)
>
> To determine the number of CHAs, read bits 27:0 of the CAPID6
> register, located at Device 30, Function 3, Offset 0x9C. These 28 bits
> form a bit vector of available LLC slices and the CHAs that manage those
> slices, so the number of set bits is the number of CHAs.
>
> Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore
> support")
> Reported-by: Kroening, Gary <gary.kroening@xxxxxxx>
> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> ---
>
> Changes since V1:
> - Add the missing pci_dev_put()
> - Drop the ugly cast by using hweight32()
> - Add comments for the macros
>
> arch/x86/events/intel/uncore_snbep.c | 31 +++++++++++++++++--------------
> 1 file changed, 17 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> index 6d8044a..8970f71 100644
> --- a/arch/x86/events/intel/uncore_snbep.c
> +++ b/arch/x86/events/intel/uncore_snbep.c
> @@ -3562,24 +3562,27 @@ static struct intel_uncore_type *skx_msr_uncores[] = {
>  	NULL,
>  };
>
> +/*
> + * To determine the number of CHAs, read bits 27:0 in the CAPID6
> + * register, which is located at Device 30, Function 3, Offset 0x9C.
> + * PCI ID 0x2083.
> + */
> +#define SKX_CAPID6 0x9c
> +#define SKX_CHA_BIT_MASK GENMASK(27, 0)
> +
> static int skx_count_chabox(void)
> {
> -	struct pci_dev *chabox_dev = NULL;
> -	int bus, count = 0;
> +	struct pci_dev *dev = NULL;
> +	u32 val = 0;
>
> -	while (1) {
> -		chabox_dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x208d, chabox_dev);
> -		if (!chabox_dev)
> -			break;
> -		if (count == 0)
> -			bus = chabox_dev->bus->number;
> -		if (bus != chabox_dev->bus->number)
> -			break;
> -		count++;
> -	}
> +	dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x2083, dev);
> +	if (!dev)
> +		goto out;
>
> -	pci_dev_put(chabox_dev);
> -	return count;
> +	pci_read_config_dword(dev, SKX_CAPID6, &val);
> +	val &= SKX_CHA_BIT_MASK;
> +out:
> +	pci_dev_put(dev);
> +	return hweight32(val);
> }
>
> void skx_uncore_cpu_init(void)
> --
> 2.7.4