Re: [patch 5/9] x86: Cure per CPU madness on UP
From: Thomas Gleixner
Date: Fri Mar 15 2024 - 13:40:52 EST
On Fri, Mar 15 2024 at 09:42, Linus Torvalds wrote:
> On Fri, 15 Mar 2024 at 09:17, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> Without line numbers (if you have debug info for that kernel, it's
> good to run "scripts/decode_stacktrace.sh" on stack traces) it's hard
> to really know what's up, but I strongly suspect that it's this:
>
> rapl_pmus->pmus[topology_logical_die_id(cpu)] = pmu;
>
> because we have
>
> topology_logical_die_id(cpu) ->
> (cpu_data(cpu).topo.logical_die_id)
>
> and we have
>
> c->topo.logical_die_id = topology_get_logical_id(apicid, TOPO_DIE_DOMAIN);
>
> and topology_get_logical_id() does this:
>
>         if (lvlid >= MAX_LOCAL_APIC)
>                 return -ERANGE;
>         if (!test_bit(lvlid, apic_maps[at_level].map))
>                 return -ENODEV;
>
> so that -ENODEV is not entirely unlikely for a UP run.
>
> This also explains why it *used* to work - that whole thing is new to
> the current merge window and came in through commit ca7e91776912
> ("Merge tag 'x86-apic-2024-03-10' of ...").
>
> Thomas, over to you. I wonder if maybe all those topology macros
> should just return 0 on an UP build, but that
> topology_get_logical_id() thing looks a bit wrong regardless.
>
> It really shouldn't depend on local apic data for configs that may not
> *have* a local apic.
Right. Let me look.