Re: [PATCH] arm64: cpufeature: use static_branch_enable_cpuslocked() (was: Re: [patch V2 00/24] cpu/hotplug: Convert get_online_cpus() to a percpu_rwsem)

From: Suzuki K Poulose
Date: Thu Apr 27 2017 - 12:36:11 EST


On Thu, Apr 27, 2017 at 04:48:06PM +0100, Mark Rutland wrote:
> Hi Catalin/Will,
>
> The below addresses a boot failure Catalin spotted in next-20170424,
> based on Sebastian's patch [1]. I've given it a spin on Juno R1, where I
> can reproduce the issue prior to applying this patch.
>
> I believe this would need to go via tip, as the issue is a result of
> change in the tip smp/hotplug branch, and the fix depends on
> infrastructure introduced there.
>
> Are you happy with the fix, and for it to go via the tip tree?
>
> Thanks,
> Mark.
>
> [1] https://lkml.kernel.org/r/20170425172838.mr3kyccsdteyjso5@xxxxxxxxxxxxx
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=smp/hotplug
>
> ---->8----
> From 6cdb503b060f74743769c9f601c35f985d3c58eb Mon Sep 17 00:00:00 2001
> From: Mark Rutland <mark.rutland@xxxxxxx>
> Date: Wed, 26 Apr 2017 09:46:47 +0100
> Subject: [PATCH] arm64: cpufeature: use static_branch_enable_cpuslocked()
>
> Recently, the hotplug locking was conveted to use a percpu rwsem. Unlike
> the existing {get,put}_online_cpus() logic, this can't nest.
> Unfortunately, in arm64's secondary boot path we can end up nesting via
> static_branch_enable() in cpus_set_cap() when we detect an erratum.
>
> This leads to a stream of messages as below, where the secondary
> attempts to schedule before it has been fully onlined. As the CPU
> orchestrating the onlining holds the rswem, this hangs the system.
>
> [ 0.250334] BUG: scheduling while atomic: swapper/1/0/0x00000002
> [ 0.250337] Modules linked in:
> [ 0.250346] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc7-next-20170424 #2
> [ 0.250349] Hardware name: ARM Juno development board (r1) (DT)
> [ 0.250353] Call trace:
> [ 0.250365] [<ffff000008088510>] dump_backtrace+0x0/0x238
> [ 0.250371] [<ffff00000808880c>] show_stack+0x14/0x20
> [ 0.250377] [<ffff00000839d854>] dump_stack+0x9c/0xc0
> [ 0.250384] [<ffff0000080e3540>] __schedule_bug+0x50/0x70
> [ 0.250391] [<ffff000008932ecc>] __schedule+0x52c/0x5a8
> [ 0.250395] [<ffff000008932f80>] schedule+0x38/0xa0
> [ 0.250400] [<ffff000008935e8c>] rwsem_down_read_failed+0xc4/0x108
> [ 0.250407] [<ffff0000080fe8e0>] __percpu_down_read+0x100/0x118
> [ 0.250414] [<ffff0000080c0b60>] get_online_cpus+0x70/0x78
> [ 0.250420] [<ffff0000081749e8>] static_key_enable+0x28/0x48
> [ 0.250425] [<ffff00000808de90>] update_cpu_capabilities+0x78/0xf8
> [ 0.250430] [<ffff00000808d14c>] update_cpu_errata_workarounds+0x1c/0x28
> [ 0.250435] [<ffff00000808e004>] check_local_cpu_capabilities+0xf4/0x128
> [ 0.250440] [<ffff00000808e894>] secondary_start_kernel+0x8c/0x118
> [ 0.250444] [<000000008093d1b4>] 0x8093d1b4
>
> We call cpus_set_cap() from update_cpu_capabilities(), which is called
> from the secondary boot path (where the CPU orchestrating the onlining
> holds the hotplug rwsem), and in the primary boot path, where this is
> not held.
>
> This patch makes cpus_set_cap() use static_branch_enable_cpuslocked(),
> and updates the primary CPU boot path to hold the rwsem so as to keep
> the *_cpuslocked() code happy.
>
> Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
> Reported-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> Suggested-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Will Deacon <will.deacon@xxxxxxx>
> Cc: Suzuki Poulose <suzuki,poulose@xxxxxxx>
> Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
> ---
> arch/arm64/include/asm/cpufeature.h | 2 +-
> arch/arm64/kernel/smp.c | 8 +++++---
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index f31c48d..349b5cd 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -145,7 +145,7 @@ static inline void cpus_set_cap(unsigned int num)
> num, ARM64_NCAPS);
> } else {
> __set_bit(num, cpu_hwcaps);
> - static_branch_enable(&cpu_hwcap_keys[num]);
> + static_branch_enable_cpuslocked(&cpu_hwcap_keys[num]);
> }
> }
>
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 9b10365..c2ce9aa 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -447,11 +447,13 @@ void __init smp_prepare_boot_cpu(void)
> cpuinfo_store_boot_cpu();
> save_boot_cpu_run_el();
> /*
> - * Run the errata work around checks on the boot CPU, once we have
> - * initialised the cpu feature infrastructure from
> - * cpuinfo_store_boot_cpu() above.
> + * Run the errata work around checks on the boot CPU, now that
> + * cpuinfo_store_boot_cpu() has set things up. We hold the percpu rwsem
> + * to keep the workaround setup code happy.
> */
> + get_online_cpus();
> update_cpu_errata_workarounds();
> + put_online_cpus();

We need similar locking for updat_cpu_capabilities() called
from setup_cpu_feature_capabilities(). i.e., the following fixup is required.

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 94b8f7f..62bdab4 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -966,6 +966,7 @@ static void __init setup_elf_hwcaps(const struct arm64_cpu_capabilities *hwcaps)
cap_set_elf_hwcap(hwcaps);
}

+/* Should be called with CPU hotplug lock held */
void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
const char *info)
{
@@ -1092,7 +1093,9 @@ void check_local_cpu_capabilities(void)

static void __init setup_feature_capabilities(void)
{
- update_cpu_capabilities(arm64_features, "detected feature:");
+ get_online_cpus();
+ update_cpu_capabilities(arm6_features, "detected feature:");
+ put_online_cpus();
enable_cpu_capabilities(arm64_features);
}

--
---

Also, I think having update_cpu_errata_workarounds() called with and without
the CPU hotplug lock makes it a bit confusing without proper explanation.
How about the following patch which makes things a bit more reader friendly ?

---8>---