Re: [RESEND PATCH v2] arm64: errata: add detection for AMEVCNTR01 incrementing incorrectly

From: Catalin Marinas
Date: Wed Aug 17 2022 - 12:59:15 EST


Hi Ionela,

On Wed, Aug 17, 2022 at 01:15:51PM +0100, Ionela Voinescu wrote:
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 7e6289e709fc..810dd3c39882 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -654,6 +654,16 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2)
> },
> #endif
> +#ifdef CONFIG_ARM64_ERRATUM_2457168
> + {
> + .desc = "ARM erratum 2457168",
> + .capability = ARM64_WORKAROUND_2457168,
> + .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
> +
> + /* Cortex-A510 r0p0-r1p1 */
> + CAP_MIDR_RANGE(MIDR_CORTEX_A510, 0, 0, 1, 1)
> + },
> +#endif
> #ifdef CONFIG_ARM64_ERRATUM_2038923
> {
> .desc = "ARM erratum 2038923",
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 907401e4fffb..af4de817d712 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1870,7 +1870,10 @@ static void cpu_amu_enable(struct arm64_cpu_capabilities const *cap)
> pr_info("detected CPU%d: Activity Monitors Unit (AMU)\n",
> smp_processor_id());
> cpumask_set_cpu(smp_processor_id(), &amu_cpus);
> - update_freq_counters_refs();
> +
> + /* 0 reference values signal broken/disabled counters */
> + if (!this_cpu_has_cap(ARM64_WORKAROUND_2457168))
> + update_freq_counters_refs();
> }
> }

>From a CPU errata workaround, this part looks fine to me.

>
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index 869ffc4d4484..5d7efb15f7cf 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -301,7 +301,8 @@ static void cpu_read_corecnt(void *val)
>
> static void cpu_read_constcnt(void *val)
> {
> - *(u64 *)val = read_constcnt();
> + *(u64 *)val = this_cpu_has_cap(ARM64_WORKAROUND_2457168) ?
> + 0UL : read_constcnt();
> }
>
> static inline
> @@ -328,7 +329,12 @@ int counters_read_on_cpu(int cpu, smp_call_func_t func, u64 *val)
> */
> bool cpc_ffh_supported(void)
> {
> - return freq_counters_valid(get_cpu_with_amu_feat());
> + int cpu = get_cpu_with_amu_feat();
> +
> + if ((cpu >= nr_cpu_ids) || !cpumask_test_cpu(cpu, cpu_present_mask))
> + return false;
> +
> + return true;
> }

So here we tell the core code that FFH is supported but always return 0
via cpc_read_ffh() if the const counter is requested. I assume the core
code figures this out and doesn't use the value on the affected CPUs. I
was hoping cpc_ffh_supported() would be per-CPU and the core code simply
skips calling cpc_read() on the broken cores. Is the other register read
by cpc_read_ffh() still useful without the const one?

While the Kconfig entry describes the behaviour, I'd rather have a
comment in cpc_ffh_supported() and maybe cpu_read_constcnt() on why we
do these tricks.

Thanks.

--
Catalin