Re: [RFC PATCH V2 0/1] x86: cpu topology fix and question on x86_max_cores

From: Peter Zijlstra
Date: Mon Feb 20 2023 - 06:09:07 EST


On Mon, Feb 20, 2023 at 11:28:55AM +0800, Zhang Rui wrote:

> Questions on how to fix cpuinfo_x86.x86_max_cores
> -------------------------------------------------
>
> Fixing x86_max_cores is more complex. The current kernel uses the logic
> below to get x86_max_cores:
> x86_max_cores = cpus_in_a_package / smp_num_siblings
> But
> 1. There is a known bug in CPUID.1F handling code. Thus cpus_in_a_package
> can be bogus. To fix it, I will add CPUID.1F Module level support.
> 2. x86_max_cores is set and used in an inconsistent way in the current kernel.
> In short, smp_num_siblings/x86_max_cores
> 2.1 represents the number of maximum *addressable* threads/cores in a
> core/package when retrieved via CPUID 1 and 4 on old platforms.
> CPUID.1 EBX 23:16 "Maximum number of addressable IDs for logical
> processors in this physical package".
> CPUID.4 EAX 31:26 "Maximum number of addressable IDs for processor
> cores in the physical package".
> 2.2 represents the number of maximum *possible* threads/cores in a
> core/package, when retrieved via CPUID.B/1F on non-Hybrid platforms.
> CPUID.B/1F EBX 15:0 "Number of logical processors at this level type.
> The number reflects configuration as shipped by Intel".
> For example, in calc_llc_size_per_core()
> do_div(llc_size, c->x86_max_cores);
> x86_max_cores is used as the max *possible* cores in a package.
> 2.3 is used in conflicting ways for other vendors like AMD, judging from
> the code. I need help confirming the proper behavior for AMD.
> For example, in amd_get_topology(),
> c->x86_coreid_bits = get_count_order(c->x86_max_cores);
> x86_max_cores is used as the max *addressable* cores in a package.
> in get_nbc_for_node(),
> cores_per_node = (c->x86_max_cores * smp_num_siblings) / amd_get_nodes_per_socket();
> x86_max_cores is used as the max *possible* cores in a package.
> 3. using
> x86_max_cores = cpus_in_a_package / smp_num_siblings
> to get the number of maximum *possible* cores in a package during boot
> cpu bringup is not applicable on platforms with asymmetric cores.
> Because, for a given number of threads, we don't know how many of the
> threads are the master thread or the only thread of a core, and how
> many of them are SMT siblings.
> For example, on a platform with 6 P-cores and 8 E-cores, there are 20
> threads. But setting x86_max_cores to 10 is clearly wrong.
>
> Given the above situation, I have the question below, and any input is
> really appreciated.
>
> Is this inconsistency a problem or not?

IIRC x86_max_cores specifically is only ever used in arch-specific code,
the PMU uncore drivers and things like that (grep shows MCE).

Also, perhaps you want to look at calculate_max_logical_packages(). That
has a comment about there not being heterogeneous systems :/

Anyway, the reason I went and had a look there is that I remember
Thomas and I spent entirely too much time trying to figure out a way
to size an array for the number of packages at boot time, and got it
wrong too many times to recount.

If only there was a sane way to tell these things without actually
bringing everything online first :-(
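
To illustrate point 3 of the quoted text, here is a minimal userspace
sketch of the arithmetic. The struct and helper names are made up for
this example and are not kernel code. count_order() is only a stand-in
for the rounding that the quoted get_count_order() line relies on,
assuming it yields ceil(log2(count)).

```c
#include <assert.h>

/* Hypothetical description of one core type in a package (not a kernel struct). */
struct core_type {
	unsigned int cores;            /* number of cores of this type */
	unsigned int threads_per_core; /* 2 with SMT, 1 without */
};

/* Total logical CPUs (threads) in the package. */
static unsigned int total_threads(const struct core_type *t, unsigned int n)
{
	unsigned int i, sum = 0;

	for (i = 0; i < n; i++)
		sum += t[i].cores * t[i].threads_per_core;
	return sum;
}

/* Actual number of cores in the package, regardless of SMT. */
static unsigned int total_cores(const struct core_type *t, unsigned int n)
{
	unsigned int i, sum = 0;

	for (i = 0; i < n; i++)
		sum += t[i].cores;
	return sum;
}

/* The division from the quoted text: cpus_in_a_package / smp_num_siblings. */
static unsigned int naive_max_cores(unsigned int cpus_in_a_package,
				    unsigned int smp_num_siblings)
{
	assert(smp_num_siblings > 0);
	return cpus_in_a_package / smp_num_siblings;
}

/*
 * Assumed stand-in for ceil(log2(count)): the number of ID bits needed
 * to address 'count' cores. Capped at 31 to avoid an undefined shift.
 */
static unsigned int count_order(unsigned int count)
{
	unsigned int order = 0;

	while (order < 31 && (1u << order) < count)
		order++;
	return order;
}
```

With the quoted example, { {6, 2}, {8, 1} }, total_threads() gives 20
and total_cores() gives 14, while naive_max_cores(20, 2) gives 10.
count_order(14) is 4, so 14 possible cores occupy a 16-entry
addressable ID space, which is the "possible" vs "addressable"
distinction from points 2.1 and 2.2.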