Re: [PATCH 1/2] perf/core: Adding capability to disable PMUs event multiplexing

From: Ganapatrao Kulkarni
Date: Wed Nov 06 2019 - 18:29:01 EST


Hi Peter, Mark,

On Wed, Nov 6, 2019 at 3:28 AM Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> On Wed, Nov 06, 2019 at 01:01:40AM +0000, Ganapatrao Prabhakerrao Kulkarni wrote:
> > When PMUs are registered, perf core enables event multiplexing
> > support by default. There is no provision for PMUs to disable
> > event multiplexing, if PMUs want to disable due to unavoidable
> > circumstances like hardware errata etc.
> >
> > Adding PMU capability flag PERF_PMU_CAP_NO_MUX_EVENTS and support
> > to allow PMUs to explicitly disable event multiplexing.
>
> Even without multiplexing, this PMU activity can happen when switching
> tasks, or when creating/destroying events, so as-is I don't think this
> makes much sense.
>
> If there's an erratum whereby heavy access to the PMU can lockup the
> core, and it's possible to workaround that by minimizing accesses, that
> should be done in the back-end PMU driver.

As described in the erratum, a CPU lockup is seen when there is heavy
memory traffic (for example, the stream application running) and the PMU
control registers are accessed frequently at the same time.

I ran perf stat on the stream application with 4 events of the ThunderX2
PMU, and again with 6 events.
With 4 events there is no event multiplexing, whereas with 6 events the
events are multiplexed.
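
For reference, one way to confirm from user space whether a run was
multiplexed is to compare the enabled and running times that perf reports
for each event. Below is a minimal sketch, assuming the event was opened
with PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING so
that read() returns the three values in this order. The struct and helper
names are made up for illustration and are not part of any kernel or perf
tool API.

```c
#include <stdint.h>

/*
 * Layout of a single-event read() result when both
 * PERF_FORMAT_TOTAL_TIME_ENABLED and PERF_FORMAT_TOTAL_TIME_RUNNING
 * are set in read_format. The struct name is hypothetical.
 */
struct counter_sample {
	uint64_t value;        /* raw counter value */
	uint64_t time_enabled; /* ns the event was enabled */
	uint64_t time_running; /* ns the event was actually on the PMU */
};

/* An event that ran for less time than it was enabled was multiplexed. */
static int was_multiplexed(const struct counter_sample *s)
{
	return s->time_running < s->time_enabled;
}

/* Linear extrapolation of the raw count over the full enabled time. */
static double scaled_count(const struct counter_sample *s)
{
	if (s->time_running == 0)
		return 0.0; /* never scheduled, nothing to scale */
	return (double)s->value * (double)s->time_enabled /
	       (double)s->time_running;
}
```

With this, a sample whose time_running equals time_enabled is reported as
not multiplexed and its count is returned unscaled.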

For 4 event run:
No of times pmu->add is called: 10
No of times pmu->del is called: 10
No of times pmu->read is called: 310

For 6 events run:
No of times pmu->add is called: 5216
No of times pmu->del is called: 5216
No of times pmu->read is called: 5216

The issue happens when add and del are called too many times, as seen
in the 6-event case.
The PMU hardware control registers are programmed whenever the add and
del functions are called.
pmu->read causes no problem, since there is no hardware issue with the
data path.
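
To illustrate why the add/del counts grow with multiplexing, here is a
rough user-space toy model, not kernel code. It assumes a fixed number of
hardware counters and that every rotation tick removes all scheduled
events and schedules the next set. The real perf core rotation logic is
more involved, and the function name, struct name and parameters are
hypothetical, so the numbers it produces are not expected to match the
measurements above.

```c
/* Totals of simulated pmu->add and pmu->del invocations. */
struct mux_stats {
	unsigned long add_calls;
	unsigned long del_calls;
};

/*
 * Toy model: nr_events compete for nr_counters. If everything fits,
 * events are added once and deleted once. If not, each rotation tick
 * deletes the currently scheduled events and adds the next window.
 */
static struct mux_stats simulate_rotation(unsigned int nr_events,
					  unsigned int nr_counters,
					  unsigned long nr_ticks)
{
	struct mux_stats st = { 0, 0 };
	unsigned int active = nr_events < nr_counters ? nr_events
						       : nr_counters;
	unsigned long t;

	st.add_calls += active;		/* initial schedule-in */

	if (nr_events > nr_counters) {
		for (t = 0; t < nr_ticks; t++) {
			st.del_calls += active;	/* rotate out */
			st.add_calls += active;	/* rotate in */
		}
	}

	st.del_calls += active;		/* final schedule-out */
	return st;
}
```

In this model, simulate_rotation(4, 4, 1000) gives 4 adds and 4 dels,
while simulate_rotation(6, 4, 1000) gives 4004 of each, so the control
register traffic scales with the number of rotation ticks only in the
oversubscribed case.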

This is an uncore driver, so I am not sure whether context switches have
any influence here.
Could you please suggest how we can fix this in the back-end PMU driver
without any help from perf core?

>
> Either way, this minimizes the utility of the PMU.
>
> Thanks,
> Mark.
>
> >
> > Signed-off-by: Ganapatrao Prabhakerrao Kulkarni <gkulkarni@xxxxxxxxxxx>
> > ---
> > include/linux/perf_event.h | 1 +
> > kernel/events/core.c | 8 ++++++++
> > 2 files changed, 9 insertions(+)
> >
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 61448c19a132..9e18d841daf7 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -247,6 +247,7 @@ struct perf_event;
> > #define PERF_PMU_CAP_HETEROGENEOUS_CPUS 0x40
> > #define PERF_PMU_CAP_NO_EXCLUDE 0x80
> > #define PERF_PMU_CAP_AUX_OUTPUT 0x100
> > +#define PERF_PMU_CAP_NO_MUX_EVENTS 0x200
> >
> > /**
> > * struct pmu - generic performance monitoring unit
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 4655adbbae10..65452784f81c 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -1092,6 +1092,10 @@ static void __perf_mux_hrtimer_init(struct perf_cpu_context *cpuctx, int cpu)
> > if (pmu->task_ctx_nr == perf_sw_context)
> > return;
> >
> > + /* No PMU support */
> > + if (pmu->capabilities & PERF_PMU_CAP_NO_MUX_EVENTS)
> > + return;
> > +
> > /*
> > * check default is sane, if not set then force to
> > * default interval (1/tick)
> > @@ -1117,6 +1121,10 @@ static int perf_mux_hrtimer_restart(struct perf_cpu_context *cpuctx)
> > if (pmu->task_ctx_nr == perf_sw_context)
> > return 0;
> >
> > + /* No PMU support */
> > + if (pmu->capabilities & PERF_PMU_CAP_NO_MUX_EVENTS)
> > + return 0;
> > +
> > raw_spin_lock_irqsave(&cpuctx->hrtimer_lock, flags);
> > if (!cpuctx->hrtimer_active) {
> > cpuctx->hrtimer_active = 1;
> > --
> > 2.17.1
> >

Thanks,
Ganapat