Re: [PATCH v18 5/7] kexec: exclude hot remove cpu from elfcorehdr notes

From: Thomas Gleixner
Date: Wed Feb 08 2023 - 08:45:37 EST


Eric!

On Tue, Feb 07 2023 at 11:23, Eric DeVolder wrote:
> On 2/1/23 05:33, Thomas Gleixner wrote:
>
> So my latest solution is to introduce two new CPUHP states, CPUHP_AP_ELFCOREHDR_ONLINE
> for onlining and CPUHP_BP_ELFCOREHDR_OFFLINE for offlining. I'm open to better names.
>
> The CPUHP_AP_ELFCOREHDR_ONLINE needs to be placed after CPUHP_BRINGUP_CPU. My
> attempts at locating this state failed when inside the STARTING section, so I located
> this just inside the ONLINE section. The crash hotplug handler is registered on
> this state as the callback for the .startup method.
>
> The CPUHP_BP_ELFCOREHDR_OFFLINE needs to be placed before CPUHP_TEARDOWN_CPU, and I
> placed it at the end of the PREPARE section. This crash hotplug handler is also
> registered on this state as the callback for the .teardown method.

TBH, that's still overengineered. Something like this:

bool cpu_is_alive(unsigned int cpu)
{
	struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);

	return data_race(st->state) <= CPUHP_AP_IDLE_DEAD;
}

and use this to query the actual state at crash time. That spares all
those callback heuristics.
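
For illustration only, a user-space sketch of what "query the state at
crash time" could look like. Everything prefixed model_/MODEL_ is
hypothetical and not kernel API: the enum is a heavily reduced stand-in
for enum cpuhp_state that keeps only the relative order of a few entries,
the per-CPU data is a plain array, and data_race() is dropped. The
comparison itself is copied from the snippet above.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, reduced stand-in for enum cpuhp_state. Only the
 * relative order of these entries is meant to mirror the kernel.
 */
enum model_cpuhp_state {
	MODEL_CPUHP_OFFLINE = 0,
	MODEL_CPUHP_BRINGUP_CPU,
	MODEL_CPUHP_AP_IDLE_DEAD,
	MODEL_CPUHP_AP_OFFLINE,
	MODEL_CPUHP_AP_ONLINE,
	MODEL_CPUHP_ONLINE,
};

#define MODEL_NR_CPUS 4

/* Stand-in for the per-CPU cpuhp_state; a plain array in this model. */
static enum model_cpuhp_state model_state[MODEL_NR_CPUS];

/* Same comparison as cpu_is_alive() above, minus data_race(). */
static bool model_cpu_is_alive(unsigned int cpu)
{
	return model_state[cpu] <= MODEL_CPUHP_AP_IDLE_DEAD;
}

/*
 * Count the CPUs for which the predicate holds. The state is read
 * when this runs, so no hotplug callback has to keep a list current.
 */
static size_t model_count_matching(void)
{
	size_t n = 0;

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_cpu_is_alive(cpu))
			n++;

	return n;
}
```

The point of the sketch is only the shape: one lockless read per CPU at
the time the notes are consumed, instead of two extra hotplug states and
their callbacks.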

> I'm making my way through percpu crash_notes, elfcorehdr, vmcoreinfo,
> makedumpfile and (the consumer of it all) the userspace crash utility,
> in order to understand the impact of moving from for_each_present_cpu()
> to for_each_online_cpu().

Is the packing actually worth the trouble? What's the actual win?

Thanks,

tglx