Re: [PATCH] pseries/hotplug: Add more delay in pseries_cpu_die while waiting for rtas-stop

From: Christophe Leroy
Date: Thu Jul 07 2022 - 05:51:37 EST




Le 06/12/2018 à 12:31, Gautham R. Shenoy a écrit :
> From: "Gautham R. Shenoy" <ego@xxxxxxxxxxxxxxxxxx>
>
> Currently running DLPAR offline/online operations in a loop on a
> POWER9 system with SMT=off results in the following crash:
>
> [ 223.321032] cpu 112 (hwid 112) Ready to die...
> [ 223.355963] Querying DEAD? cpu 113 (113) shows 2
> [ 223.356233] cpu 114 (hwid 114) Ready to die...
> [ 223.356236] cpu 113 (hwid 113) Ready to die...
> [ 223.356241] Bad kernel stack pointer 1faf6ca0 at 1faf6d50
> [ 223.356243] Oops: Bad kernel stack pointer, sig: 6 [#1]
> [ 223.356244] LE SMP NR_CPUS=2048 NUMA pSeries
> [ 223.356247] Modules linked in:
> [ 223.356255] CPU: 114 PID: 0 Comm: swapper/114 Not tainted 4.20.0-rc3 #39
> [ 223.356258] NIP: 000000001faf6d50 LR: 000000001faf6d50 CTR: 000000001ec6d06c
> [ 223.356259] REGS: c00000001e5cbd30 TRAP: 0700 Not tainted (4.20.0-rc3)
> [ 223.356260] MSR: 8000000000081000 <SF,ME> CR: 28000004 XER: 00000020
> [ 223.356263] CFAR: 000000001ec98590 IRQMASK: 8000000000009033
> GPR00: 000000001faf6d50 000000001faf6ca0 000000001ed1c448 00000267e6a273c3
> GPR04: 0000000000000000 00000000000000e0 000000000000dfe8 000000001faf6d30
> GPR08: 000000001faf6d28 00000267e6a273c3 000000001ec1b108 0000000000000000
> GPR12: 0000000001b6d998 c00000001eb55080 c0000003a1b8bf90 000000001eea3f40
> GPR16: 0000000000000000 c0000006fda85100 c00000000004c8b0 c0000000013d5300
> GPR20: c0000006fda85300 0000000000000008 c0000000019d2cf8 c0000000013d6888
> GPR24: 0000000000000072 c0000000013d688c 0000000000000002 c0000000013d688c
> GPR28: c0000000019cecf0 0000000000000390 0000000000000000 000000001faf6ca0
> [ 223.356281] NIP [000000001faf6d50] 0x1faf6d50
> [ 223.356281] LR [000000001faf6d50] 0x1faf6d50
> [ 223.356282] Call Trace:
> [ 223.356283] Instruction dump:
> [ 223.356285] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [ 223.356286] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [ 223.356290] ---[ end trace f46a4e046b564d1f ]---
>
> This is due to multiple offlined CPUs (CPU 113 and CPU 114 above)
> concurrently (within 3us) trying to make the rtas-call with the
> "stop-self" token, something that is prohibited by the PAPR.
>
> The concurrent calls can happen as follows.
>
> o In dlpar_offline_cpu() we prod an offline CPU X (offline due to
> SMT=off) and loop for 25 tries in pseries_cpu_die() querying if
> the target CPU X has been stopped in RTAS. After 25 tries, we
> prints the message "Querying DEAD? cpu X (X) shows 2" and return
> to dlpar_offline_cpu(). Note that at this point CPU X has not yet
> called rtas with the "stop-self" token, but can do so anytime now.
>
> o Back in dlpar_offline_cpu(), we prod the next offline CPU Y. CPU Y
> promptly wakes up and calls RTAS with the "stop-self" token.
>
> o Before RTAS can stop CPU Y, CPU X also calls RTAS with the
> "stop-self" token.
>
> The problem occurs due to the short number of tries (25) in
> pseries_cpu_die() which covers 90% of the cases. For the remaining 10%
> of the cases, it was observed that we would need to loop for a few
> hundred iterations before the target CPU calls rtas with "stop-self"
> token.Experiments show that each try takes roughly ~25us.
>
> In this patch we fix the problem by increasing the number of tries
> from 25 to 4000 (which roughly corresponds to 100ms) before bailing
> out and declaring that we have failed to observe the target CPU call
> rtas-stop-self. This provides sufficient serialization to ensure that
> there no concurrent rtas calls with "stop-self" token.
>
> Reported-by: Michael Bringmann <mwb@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Gautham R. Shenoy <ego@xxxxxxxxxxxxxxxxxx>

This patch badly conflicts with 801980f64979
("powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death")

Is it still applicable ? If so, please rebase and resubmit.

Thanks
Christophe

> ---
> arch/powerpc/platforms/pseries/hotplug-cpu.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index 2f8e621..c913c44 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -214,14 +214,25 @@ static void pseries_cpu_die(unsigned int cpu)
> msleep(1);
> }
> } else if (get_preferred_offline_state(cpu) == CPU_STATE_OFFLINE) {
> + int max_tries = 4000; /* Roughly corresponds to 100ms */
> + u64 tb_before = mftb();
>
> - for (tries = 0; tries < 25; tries++) {
> + for (tries = 0; tries < max_tries; tries++) {
> cpu_status = smp_query_cpu_stopped(pcpu);
> if (cpu_status == QCSS_STOPPED ||
> cpu_status == QCSS_HARDWARE_ERROR)
> break;
> cpu_relax();
> }
> +
> + if (tries == max_tries) {
> + u64 time_elapsed_us = div_u64(mftb() - tb_before,
> + tb_ticks_per_usec);
> +
> + pr_warn("Offlined CPU %d isn't stopped by RTAS after %llu us\n",
> + cpu, time_elapsed_us);
> + WARN_ON(1);
> + }
> }
>
> if (cpu_status != 0) {