Re: [PATCH v16 3/8] cpu/hotplug: Add dynamic parallel bringup states before CPUHP_BRINGUP_CPU

From: Thomas Gleixner
Date: Thu Mar 23 2023 - 19:48:51 EST


On Thu, Mar 23 2023 at 22:49, David Woodhouse wrote:
> On Thu, 2023-03-23 at 23:36 +0100, Thomas Gleixner wrote:
>> There is no point in special casing this. All architectures can invoke
>> the CPUHP_BP_* states before CPUHP_BRINGUP_CPU for each to be brought up
>> CPU first. So this can be made unconditional and common exercised code.
>>
>
> There were three paragraphs in the commit message explaining why I
> didn't want to do that. It didn't work for x86 before I started, and I
> haven't reviewed *every* other architecture to ensure that it will work
> there. It was opt-in for a reason. :)

I noticed myself before reading your reply :)

>> Aside of that this dynamic state range is pretty pointless. There is
>> really nothing dynamic here and there is no real good reason to have
>> four dynamic parallel states just because.
>
> The original patch set did use more than one state; the plan to do
> microcode updates in parallel does involve doing at least one more, I
> believe.

I don't think so. The micro code muck can completely serialize itself
and does not need control CPU assistance if done right. If we go there
we have to fix that mess and not just proliferatng it with moar duct tape.

>> +       /*
>> +        * Fully per AP serialized bringup from here on. If the
>> +        * architecture does no register the CPUHP_BP_PARALLEL_STARTUP
>> +        * state, this step sends the startup IPI first.
>> +        */
>
> Not sure I'd conceded that yet; the APs do their *own* bringup from
> here, and that really ought to be able to run in parallel.

Somewhere in the distance future once we've

1) sorted the mandatory synchronization points, e.g. TSC sync in the
early bootup phase.

2) audited _all_ AP state callbacks that they are able to cope with
parallel bringup.

That's a long road as there are tons of assumptions about the
implicit CPU hotplug serialization in those callbacks.

Just because it did not explode in your face yet does not mean this
is safe.

I just looked at 10 randomly picked AP online callbacks and found 5
of them being not ready :)

Thanks,

tglx