Re: [PATCH RESEND 1/1] x86/smpboot: check cpu_initialized_mask first after returning from schedule()

From: Dongli Zhang
Date: Wed Jan 05 2022 - 18:29:14 EST


Hi Tim,

On 1/5/22 11:19 AM, Tim Chen wrote:
> On Thu, 2021-12-23 at 13:03 -0800, Dongli Zhang wrote:
>> To online a new CPU, the master CPU signals the secondary and waits
>> for at
>> most 10 seconds until cpu_initialized_mask is set by the secondary
>> CPU.
>> There is a case that the master CPU fails the online when it takes
>> 10+
>> seconds for schedule() to return (although the cpu_initialized_mask
>> is
>> already set by the secondary CPU).
>>
>> 1. The master CPU signals the secondary CPU in do_boot_cpu().
>>
>> 2. As the cpu_initialized_mask is still not set, the master CPU
>> reschedules
>> via schedule().
>>
>> 3. Suppose it takes 10+ seconds until schedule() returns due to
>> performance
>> issue. The secondary CPU sets the cpu_initialized_mask during those
>> 10+
>> seconds.
>
> The patch seems reasonable. But do you know what other task got run
> and caused such long delay (>10 sec), preventing switch
> back to in the master CPU? It seems like there could be some problem
> causing the long delay. It is better if we understand the root cause
> of that.

So far I do not have a consistent repro to verify which kernel function (or
kernel thread) is making the trouble.

However, it is not necessary for other kernel task to take >10 sec to reproduce
the issue, e.g.:

1. The master CPU signals the secondary CPU.
2. Due to the issue at KVM or hardware level (let's assume there is an issue),
the secondary CPU is waken up at the 8th second.
3. The schedule() returns after 3 seconds (due to the 3-second delay by another
kernel task).
4. The 3+8=11 seconds are more than 10 seconds.

Therefore, we should first check whether the secondary CPU is waken up, instead
of timeout. Otherwise, the secondary CPU will be put into an non-recoverable
state, until the OS reboots,

Thank you very much!

Dongli Zhang

>
> Tim
>
>>
>> 4. Once the schedule() at the master CPU returns, although the
>> cpu_initialized_mask is set, the time_before(jiffies, timeout) fails.
>> As a
>> result, the master CPU regards this as failure.
>>
>> [ 51.983296] smpboot: do_boot_cpu failed(-1) to wakeup CPU#4
>>
>> 5. Since the secondary CPU is stuck at state CPU_UP_PREPARE, any
>> further
>> online operation will fail by cpu_check_up_prepare(), unless the
>> below
>> patch set is applied.
>>
>> https://urldefense.com/v3/__https://lore.kernel.org/all/20211206152034.2150770-1-bigeasy@xxxxxxxxxxxxx/__;!!ACWV5N9M2RV99hQ!d7NY8MgLj7W5ZGWS_0HHsvE52WNh2WJbJwLNBJYLGzGIY9BzKg-PUYiIYMmrwud36Ys$
>>
>> This issue is resolved by always first checking whether the secondary
>> CPU
>> has set cpu_initialized_mask.
>>
>> Cc: Longpeng(Mike) <longpeng2@xxxxxxxxxx>
>> Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
>> Cc: Joe Jin <joe.jin@xxxxxxxxxx>
>> Signed-off-by: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
>> ---
>> To resend by Cc linux-kernel@xxxxxxxxxxxxxxx as well.
>>
>> arch/x86/kernel/smpboot.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index 617012f4619f..faad4fcf67eb 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -1145,7 +1145,7 @@ static int do_boot_cpu(int apicid, int cpu,
>> struct task_struct *idle,
>> */
>> boot_error = -1;
>> timeout = jiffies + 10*HZ;
>> - while (time_before(jiffies, timeout)) {
>> + while (true) {
>> if (cpumask_test_cpu(cpu,
>> cpu_initialized_mask)) {
>> /*
>> * Tell AP to proceed with
>> initialization
>> @@ -1154,6 +1154,10 @@ static int do_boot_cpu(int apicid, int cpu,
>> struct task_struct *idle,
>> boot_error = 0;
>> break;
>> }
>> +
>> + if (time_after_eq(jiffies, timeout))
>> + break;
>> +
>> schedule();
>> }
>> }
>