Re: [PATCH] arch_topology: support parsing cache topology from DT

From: Dietmar Eggemann
Date: Thu Apr 07 2022 - 16:33:23 EST


On 06/04/2022 11:18, Qing Wang wrote:
> From: wangqing <11112896@xxxxxxxxxx>

[...]

> +void init_cpu_cache_topology(void)
> +{
> + struct device_node *node_cpu, *node_cache;
> + int cpu;
> + int level = 0;
> +
> + for_each_possible_cpu(cpu) {
> + node_cpu = of_get_cpu_node(cpu, NULL);
> + if (!node_cpu)
> + continue;
> +
> + level = 0;
> + node_cache = node_cpu;
> + while (level < MAX_CACHE_LEVEL) {
> + node_cache = of_parse_phandle(node_cache, "next-level-cache", 0);
> + if (!node_cache)
> + break;
> +
> + cache_topology[cpu][level++] = node_cache;
> + }
> + of_node_put(node_cpu);
> + }
> +}

From where is init_cpu_cache_topology() called?

> +bool cpu_share_llc(int cpu1, int cpu2)
> +{
> + int cache_level;
> +
> + for (cache_level = MAX_CACHE_LEVEL - 1; cache_level > 0; cache_level--) {
> + if (!cache_topology[cpu1][cache_level])
> + continue;
> +
> + if (cache_topology[cpu1][cache_level] == cache_topology[cpu2][cache_level])
> + return true;
> +
> + return false;
> + }
> +
> + return false;
> +}

Like I mentioned in:

https://lkml.kernel.org/r/73b491fe-b5e8-ebca-081e-fa339cc903e1@xxxxxxx

the correct setting in DT's cpu-map node (only core nodes in your case,
i.e. one DynamIQ cluster) will give you the correct LLC (highest
SD_SHARE_PKG_RESOURCES) setting.

https://www.kernel.org/doc/Documentation/devicetree/bindings/arm/topology.txt

> +
> +bool cpu_share_l2c(int cpu1, int cpu2)
> +{
> + if (!cache_topology[cpu1][0])
> + return false;
> +
> + if (cache_topology[cpu1][0] == cache_topology[cpu2][0])
> + return true;
> +
> + return false;
> +}
> +
> /*
> * cpu topology table
> */
> @@ -662,7 +720,7 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
> /* not numa in package, lets use the package siblings */
> core_mask = &cpu_topology[cpu].core_sibling;
> }
> - if (cpu_topology[cpu].llc_id != -1) {
> + if (cpu_topology[cpu].llc_id != -1 || cache_topology[cpu][0]) {
> if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
> core_mask = &cpu_topology[cpu].llc_sibling;
> }
> @@ -684,7 +742,8 @@ void update_siblings_masks(unsigned int cpuid)
> for_each_online_cpu(cpu) {
> cpu_topo = &cpu_topology[cpu];
>
> - if (cpuid_topo->llc_id == cpu_topo->llc_id) {
> + if ((cpuid_topo->llc_id != -1 && cpuid_topo->llc_id == cpu_topo->llc_id)
> + || (cpuid_topo->llc_id == -1 && cpu_share_llc(cpu, cpuid))) {

Assuming a:

      .---------------.
CPU   |0 1 2 3 4 5 6 7|
      +---------------+
uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
      +---------------+
  L2  |   |   | | | | |
      +---------------+
  L3  |<--         -->|
      +---------------+
      |<-- cluster -->|
      +---------------+
      |<--   DSU   -->|
      '---------------'

system, I guess you would get (w/ Phantom SD and L2/L3 cache info in DT):

CPU0 .. 3:

  MC   SD_SHARE_PKG_RESOURCES
  DIE  no SD_SHARE_PKG_RESOURCES

CPU4 .. 7:

  DIE  no SD_SHARE_PKG_RESOURCES

I can't see how this would make any sense ...

The reason is cpu_share_llc(): you don't check cache_level=0, and w/

CPU0 .. 3:
  cache_topology[CPUX][0] == L2
  cache_topology[CPUX][1] == L3

CPU4 .. 7:
  cache_topology[CPUX][0] == L3

there is, except for CPU0-1 and CPU2-3, no LLC match.
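
The behaviour can be tried outside the kernel. What follows is a minimal
user-space sketch, not kernel code: cpu_share_llc() is copied from the
quoted hunk, while NR_CPUS, the value chosen for MAX_CACHE_LEVEL, the
l2_01/l2_23/l3 stand-in objects and the helpers check_llc_model() and
print_llc_matrix() are assumptions made purely for illustration. The
table mirrors the cache_topology[][] layout listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NR_CPUS		8
#define MAX_CACHE_LEVEL	7	/* assumed; the quoted hunk does not show it */

/* Stand-ins for DT cache nodes; only pointer identity matters here. */
static int l2_01, l2_23, l3;

/*
 * Assumed model of the table from the mail:
 * CPU0..3: [0] == L2, [1] == L3
 * CPU4..7: [0] == L3
 */
static const void *cache_topology[NR_CPUS][MAX_CACHE_LEVEL] = {
	[0] = { &l2_01, &l3 },
	[1] = { &l2_01, &l3 },
	[2] = { &l2_23, &l3 },
	[3] = { &l2_23, &l3 },
	[4] = { &l3 },
	[5] = { &l3 },
	[6] = { &l3 },
	[7] = { &l3 },
};

/* Same logic as in the quoted patch: index 0 is never looked at. */
bool cpu_share_llc(int cpu1, int cpu2)
{
	int cache_level;

	for (cache_level = MAX_CACHE_LEVEL - 1; cache_level > 0; cache_level--) {
		if (!cache_topology[cpu1][cache_level])
			continue;

		if (cache_topology[cpu1][cache_level] == cache_topology[cpu2][cache_level])
			return true;

		return false;
	}

	return false;
}

/* Hypothetical helper: spot checks of what the model yields. */
void check_llc_model(void)
{
	assert(cpu_share_llc(0, 1));	/* same L3 node at index 1 */
	assert(cpu_share_llc(0, 3));	/* same L3 node at index 1 */
	assert(!cpu_share_llc(0, 4));	/* CPU4 has nothing at index 1 */
	assert(!cpu_share_llc(4, 5));	/* CPU4's only entry is at index 0 */
	assert(!cpu_share_llc(4, 4));	/* not even a match with itself */
}

/* Hypothetical helper: one row per cpu1, 'x' where cpu_share_llc() is true. */
void print_llc_matrix(void)
{
	for (int a = 0; a < NR_CPUS; a++) {
		for (int b = 0; b < NR_CPUS; b++)
			putchar(cpu_share_llc(a, b) ? 'x' : '.');
		putchar('\n');
	}
}
```

Calling print_llc_matrix() from a small main() marks only the
CPU0..3 x CPU0..3 block in this model. The rows for CPU4..7 stay empty,
since their only entry sits at index 0, which the loop never reaches.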

[...]