Re: [PATCH] powerpc/perf: Fix for core/nest imc call trace on cpuhotplug

From: Santosh Sivaraj
Date: Thu Oct 05 2017 - 05:50:25 EST


* Anju T Sudhakar <anju@xxxxxxxxxxxxxxxxxx> wrote (on 2017-10-04 06:50:52 +0000):

> Nest/core pmu units are enabled only when it is used. A reference count is
> maintained for the events which uses the nest/core pmu units. Currently in
> *_imc_counters_release function a WARN() is used for notification of any
> underflow of ref count.
>
> The case where event ref count hit a negative value is, when perf session is
> started, followed by offlining of all cpus in a given core.
> i.e. in cpuhotplug offline path ppc_core_imc_cpu_offline() function set the
> ref->count to zero, if the current cpu which is about to offline is the last
> cpu in a given core and make an OPAL call to disable the engine in that core.
> And on perf session termination, perf->destroy (core_imc_counters_release) will
> first decrement the ref->count for this core and based on the ref->count value
> an opal call is made to disable the core-imc engine.
> Now, since cpuhotplug path already clears the ref->count for core and disabled
> the engine, perf->destroy() decrementing again at event termination make it
> negative which in turn fires the WARN_ON. The same happens for nest units.
>
> Add a check to see if the reference count is alreday zero, before decrementing
> the count, so that the ref count will not hit a negative value.
>
> Signed-off-by: Anju T Sudhakar <anju@xxxxxxxxxxxxxxxxxx>

Reviewed-by: Santosh Sivaraj <santosh@xxxxxxxxxx>
> ---
> arch/powerpc/perf/imc-pmu.c | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
> diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
> index 9ccac86f3463..e3a1f65933b5 100644
> --- a/arch/powerpc/perf/imc-pmu.c
> +++ b/arch/powerpc/perf/imc-pmu.c
> @@ -399,6 +399,20 @@ static void nest_imc_counters_release(struct perf_event *event)
>
> /* Take the mutex lock for this node and then decrement the reference count */
> mutex_lock(&ref->lock);
> + if (ref->refc == 0) {
> + /*
> + * The scenario where this is true is, when perf session is
> + * started, followed by offlining of all cpus in a given node.
> + *
> + * In the cpuhotplug offline path, ppc_nest_imc_cpu_offline()
> + * function set the ref->count to zero, if the cpu which is
> + * about to offline is the last cpu in a given node and make
> + * an OPAL call to disable the engine in that node.
> + *
> + */
> + mutex_unlock(&ref->lock);
> + return;
> + }
> ref->refc--;
> if (ref->refc == 0) {
> rc = opal_imc_counters_stop(OPAL_IMC_COUNTERS_NEST,
> @@ -646,6 +660,20 @@ static void core_imc_counters_release(struct perf_event *event)
> return;
>
> mutex_lock(&ref->lock);
> + if (ref->refc == 0) {
> + /*
> + * The scenario where this is true is, when perf session is
> + * started, followed by offlining of all cpus in a given core.
> + *
> + * In the cpuhotplug offline path, ppc_core_imc_cpu_offline()
> + * function set the ref->count to zero, if the cpu which is
> + * about to offline is the last cpu in a given core and make
> + * an OPAL call to disable the engine in that core.
> + *
> + */
> + mutex_unlock(&ref->lock);
> + return;
> + }
> ref->refc--;
> if (ref->refc == 0) {
> rc = opal_imc_counters_stop(OPAL_IMC_COUNTERS_CORE,

--