Re: [RFCv1 4/4] perf: arm_spe: Dynamically switch PID tracing to contextidr

From: Catalin Marinas
Date: Tue Dec 07 2021 - 06:48:13 EST


On Sun, Dec 05, 2021 at 09:51:03PM +0800, Leo Yan wrote:
> On Fri, Dec 03, 2021 at 04:22:42PM +0000, Catalin Marinas wrote:
> > What's the cost of always enabling CONFIG_PID_IN_CONTEXTIDR? If it's
> > negligible, I'd not bother at all with any of the enabling/disabling.
>
> Yes, I compared performance for PID tracing with always enabling and
> disabling CONFIG_PID_IN_CONTEXTIDR, and also compared with using
> static key for enabling/disabling PID tracing. The result shows the
> cost is negligible based on the benchmark 'perf bench sched'.
>
> Please see the detailed data in below link (note the testing results
> came from my Juno board):
> https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@xxxxxxxxxx/

The table wasn't entirely clear to me. So the dis/enb benchmarks are
without this patchset applied. There seems to be a minor drop but it's
probably noise. Anyway, do we need this patchset or we just make
CONFIG_PID_IN_CONTEXTIDR default to y?

> > Another question: can you run multiple instances of SPE for different
> > threads on different CPUs? What happens to the global contextidr_in_use
> > key when one of them stops?
>
> No, I only can launch one instance for Arm SPE event via perf tool; when
> I tried to launch a second instance, perf tool reports failure:
>
> The sys_perf_event_open() syscall returned with 16 (Device or resource
> busy) for event (arm_spe_0/load_filter=1,store_filter=1/u).
[...]
> Alternatively, I'd like give several examples for contextidr_in_use key
> values when run different perf modes.
[...]
> Hope these three cases can demonstrate the usage for contextidr_in_use
> static key.

OK, so we can have multiple uses of PID in CONTEXTIDR. Since
static_branch_inc() is refcounted, we get away with this but the
downside is that a CPU won't notice until its next thread switch.

--
Catalin