Re: AMD EPYC Topology problems

From: Borislav Petkov
Date: Mon Dec 03 2018 - 06:14:08 EST


On Sun, Dec 02, 2018 at 08:23:05PM +0000, Andrew Cooper wrote:
> Hello,
>
> I have a dual socket server with the following processor:
>
> [root@xrtmia-09-01 ~]# head /proc/cpuinfo
> processor : 0
> vendor_id : AuthenticAMD
> cpu family : 23
> model : 1
> model name : AMD EPYC 7281 16-Core Processor
> stepping : 2
>
> Which has highlighted an issue in the topology derivation logic.
> (Actually, it was discovered with Xen, but we share the same topology
> infrastructure and the issue is also present with Linux).
>
> There are a total of 64 threads in the system, made of two 32-thread
> sockets. The APIC IDs for this system are sparse - they are 0x0-0x3,
> 0x8-0xb, 0x10-0x13 etc, all the way up to 0x7b.
>
> This is because the socket is made of 4 nodes with 4 cores each, but
> space has been left in the layout for the maximum possible number of
> APIC IDs.
>
> In particular, CPUID 0x80000008:ecx reports 0x0000601f. That is, an
> APIC ID shift of 6 (reporting a maximum of 64 threads per socket), and
> NC as 31 (reporting 32 threads per socket in the current configuration).
>
> c->x86_max_cores is derived from NC and shifted once to exclude threads,
> giving it a final value of 16 cores per socket.

So far so good.
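
For reference, the decoding described in the quoted text can be sketched
in C as below. This is only an illustration, not the kernel's code: the
bit positions (NC in ECX[7:0], the APIC ID size in ECX[15:12]) follow the
quoted description, the struct and function names are hypothetical, and
the single right shift assumes two threads per core as on this box.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields; not a kernel structure. */
struct amd_core_info {
	unsigned int apicid_shift; /* ECX[15:12]: APIC ID size in bits */
	unsigned int nc;           /* ECX[7:0]: thread count minus one */
	unsigned int max_threads;  /* APIC ID space per socket */
	unsigned int threads;      /* threads in the current configuration */
	unsigned int cores;        /* threads >> 1, assuming 2 threads per core */
};

/* Decode the ECX value returned by CPUID leaf 0x80000008. */
static struct amd_core_info decode_80000008_ecx(uint32_t ecx)
{
	struct amd_core_info info;

	info.apicid_shift = (ecx >> 12) & 0xf;
	info.nc = ecx & 0xff;

	/* The field is 4 bits wide, so the shift below cannot overflow. */
	assert(info.apicid_shift < 32);

	info.max_threads = 1u << info.apicid_shift;
	info.threads = info.nc + 1;
	info.cores = info.threads >> 1;

	return info;
}
```

With the quoted value 0x0000601f this gives a shift of 6, 64 possible
threads, 32 actual threads and 16 cores per socket.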

> Given the sparseness of the APIC IDs, it is unsafe to allocate an array

Do we do this somewhere or is this a hypothetical thing?

> of c->x86_max_cores entries, then index it with c->cpu_core_id, as half
> the cores in the system have a cpu_core_id greater than x86_max_cores.

You lost me here. ->cpu_core_id comes from CPUID_Fn8000001E_EBX[7:0].
Are you saying, those core IDs on your box are sparse like the APIC IDs
you mention above?
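
To make the question concrete, here is a small C sketch of what the
quoted concern would look like *if* the core ID were simply the
socket-relative APIC ID with the thread bit dropped. That derivation is
an assumption made purely for illustration (it is not confirmed anywhere
in this thread, and it is not the CPUID_Fn8000001E_EBX[7:0] value). The
helper names are hypothetical; the layout (4 APIC IDs used, 4 skipped)
and the constants are taken from the quoted message.

```c
#include <assert.h>
#include <stdbool.h>

#define APICID_SHIFT 6  /* from the quoted CPUID 0x80000008:ecx value */
#define MAX_CORES    16 /* the quoted x86_max_cores value */

/* ASSUMPTION: core ID = socket-relative APIC ID without the thread bit. */
static unsigned int core_id_from_apicid(unsigned int apicid)
{
	return (apicid & ((1u << APICID_SHIFT) - 1)) >> 1;
}

/* Layout from the quoted text: 0x0-0x3 used, 0x4-0x7 skipped, and so on. */
static bool apicid_present(unsigned int apicid)
{
	return (apicid & 0x4) == 0;
}

/*
 * Count the cores in one socket whose assumed core ID would index past
 * the end of an array with MAX_CORES entries.
 */
static unsigned int count_out_of_range_cores(void)
{
	unsigned int apicid, total = 0, bad = 0;

	/* Step by 2 to visit the first thread of each core. */
	for (apicid = 0; apicid < (1u << APICID_SHIFT); apicid += 2) {
		if (!apicid_present(apicid))
			continue;
		total++;
		if (core_id_from_apicid(apicid) >= MAX_CORES)
			bad++;
	}

	/* Sanity check: the layout really contains MAX_CORES cores. */
	assert(total == MAX_CORES);

	return bad;
}
```

Under that assumption, 8 of the 16 cores in a socket end up with an ID of
16 or more, which would match the "half the cores" figure in the quoted
text. Whether the real ->cpu_core_id values behave like this is exactly
what is being asked.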

> There is no logical core ID derived during boot which might be safe to
> use as an index.
>
> Furthermore, the documentation indicates that these values are expected
> to be per-package, while they are all actually per-socket (with up to 4
> nodes per socket) in the EPYC case.

From Documentation/x86/topology.txt:
"
- cpuinfo_x86.x86_max_cores:

The number of cores in a package. This information is retrieved via CPUID."

--
Regards/Gruss,
Boris.
