Re: [PATCH v9 0/8] Parallel CPU bringup for x86_64

From: David Woodhouse
Date: Tue Feb 21 2023 - 02:17:45 EST

On 21 February 2023 04:20:41 GMT, Kim Phillips <kim.phillips@xxxxxxx> wrote:
>On 2/20/23 5:30 PM, David Woodhouse wrote:
>> On Mon, 2023-02-20 at 17:23 -0600, Kim Phillips wrote:
>>> On 2/20/23 3:39 PM, David Woodhouse wrote:
>>>> On 20 February 2023 21:23:38 GMT, Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx> wrote:
>>>>> On 20.02.2023 21:31, David Woodhouse wrote:
>>>>>> On Mon, 2023-02-20 at 17:40 +0100, Oleksandr Natalenko wrote:
>>>>>>> On Monday 20 February 2023 17:20:13 CET David Woodhouse wrote:
>>>>>>>> On Mon, 2023-02-20 at 17:08 +0100, Oleksandr Natalenko wrote:
>>>>>>>>>
>>>>>>>>> I've applied this to the v6.2 kernel, and suspend/resume broke on my
>>>>>>>>> Ryzen 5950X desktop. The machine suspends just fine, but on resume
>>>>>>>>> the screen stays blank, and there's no visible disk I/O.
>>>>>>>>>
>>>>>>>>> Reverting the series brings suspend/resume back to working state.
>>>>>>>>
>>>>>>>> Hm, thanks. What if you add 'no_parallel_bringup' on the command line?
>>>>>>>
>>>>>>> If the `no_parallel_bringup` param is added, the suspend/resume works.
>>>>>>
>>>>>> Thanks for the testing. Can I ask you to do one further test: apply the
>>>>>> series only as far as patch 6/8 'x86/smpboot: Support parallel startup
>>>>>> of secondary CPUs'.
>>>>>>
>>>>>> That will do the new startup asm sequence where each CPU finds its own
>>>>>> per-cpu data so it *could* work in parallel, but doesn't actually do
>>>>>> the bringup in parallel yet.
>>>>>
>>>>> With patches 1 to 6 (inclusive) applied and no extra cmdline
>>>>> params added the resume doesn't work.
>>>>
>>>> Hm. Kim, is there some weirdness with the way AMD CPUs get their
>>>> APIC ID in CPUID 0x1? Especially after resume?
>>>
>>> Not to my knowledge.  Mario?
>
>I tested v9-up-to-6/8 on a Ryzen 3000 that passed your between-v6 & v7
>tree commits (ce7e2d1e046a for the parallel-6.2-rc6-part1 tag
>and 17bbd12ee03 for parallel-6.2-rc6), and it, too, fails to resume
>v9-up-to-6/8 after suspend.
>
>> Oleksandr, please could you show the output of 'cpuid' after a
>> successful resume? I'm particularly looking for this part...
>>
>>
>> $ sudo cpuid | grep -A1 1/ebx
>> miscellaneous (1/ebx):
>> process local APIC physical ID = 0x0 (0)
>> --
>> miscellaneous (1/ebx):
>> process local APIC physical ID = 0x2 (2)
>> ...
>
>The Ryzens have a different pattern it seems:
>
>$ sudo cpuid | grep -A1 \(1/ebx
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x0 (0)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x1 (1)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x2 (2)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x3 (3)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x4 (4)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x5 (5)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x6 (6)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x7 (7)
>
>
>I tested the v7 series on Ryzen, it also fails, so
>Ryzen users were last known good with those two
>aforementioned commits on your tree:
>
>git://git.infradead.org/users/dwmw2/linux.git

That was when it was only using (and validating) CPUID 0xB and never trusting CPUID 0x1, right?
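
For anyone following along, here is a minimal sketch of the two places the APIC ID can be read from. It is an illustration only, not the startup asm from the series: the function names are made up, and the "prefer 0xB, fall back to 0x1" policy is an assumption for the sake of the example. The helpers take raw register values, so they can be fed from any CPUID source (e.g. __get_cpuid_count() from <cpuid.h> with GCC/Clang on x86).

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 0x1: EBX[31:24] holds the 8-bit initial APIC ID. */
static inline uint32_t apic_id_from_leaf1(uint32_t ebx)
{
	return ebx >> 24;
}

/*
 * CPUID leaf 0xB is only usable if the CPU reports it (max basic
 * leaf >= 0xB) and EBX of subleaf 0 is non-zero.
 */
static inline bool leaf_b_usable(uint32_t max_basic_leaf, uint32_t ebx_subleaf0)
{
	return max_basic_leaf >= 0xb && ebx_subleaf0 != 0;
}

/* CPUID leaf 0xB: EDX holds the full 32-bit x2APIC ID. */
static inline uint32_t apic_id_from_leafb(uint32_t edx)
{
	return edx;
}

/* Hypothetical policy: trust leaf 0xB when usable, else leaf 0x1. */
static inline uint32_t pick_apic_id(uint32_t max_basic_leaf,
				    uint32_t leaf1_ebx,
				    uint32_t leafb_ebx, uint32_t leafb_edx)
{
	if (leaf_b_usable(max_basic_leaf, leafb_ebx))
		return apic_id_from_leafb(leafb_edx);
	return apic_id_from_leaf1(leaf1_ebx);
}
```

If the two leaves disagree on a given CPU after resume, comparing apic_id_from_leaf1() against apic_id_from_leafb() for each CPU should show it.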