Re: [PATCH] perf vendor events arm64: Add hip08 implementation defined PMU core events

From: Will Deacon
Date: Wed Oct 18 2017 - 13:04:05 EST


Hi John,

On Wed, Oct 18, 2017 at 05:37:03PM +0100, John Garry wrote:
> On 18/10/2017 11:49, Will Deacon wrote:
> >On Wed, Oct 18, 2017 at 10:25:39AM +0100, John Garry wrote:
> >>On 17/10/2017 13:59, Will Deacon wrote:
> >>>On Tue, Oct 17, 2017 at 03:01:39PM +0800, Shaokun Zhang wrote:
> >>>>diff --git a/tools/perf/pmu-events/arch/arm64/hisilicon/hip08-imp-def.json b/tools/perf/pmu-events/arch/arm64/hisilicon/hip08-imp-def.json
> >>>>new file mode 100644
> >>>>index 0000000..6bb31da
> >>>>--- /dev/null
> >>>>+++ b/tools/perf/pmu-events/arch/arm64/hisilicon/hip08-imp-def.json
> >>>>@@ -0,0 +1,176 @@
> >>>>+[
> >>>>+ {
> >>>>+ "PublicDescription": "Attributable Level 1 data cache access, read",
> >>>>+ "EventCode": "0x40",
> >>>>+ "EventName": "L1D_CACHE_RD",
> >>>>+ "BriefDescription": "L1D cache access, read",
> >>>>+ },
> >>>>+ {
> >>>>+ "PublicDescription": "Attributable Level 1 data cache access, write",
> >>>>+ "EventCode": "0x41",
> >>>>+ "EventName": "L1D_CACHE_WR",
> >>>>+ "BriefDescription": "L1D cache access, write",
> >>>>+ },
> >>>
> >>>So these are the same as the events in cavium/thunderx2-imp-def.json and
> >>>should be factored out. In fact, ARM recommends event numbers for 0x40-0xBF,
> >>>so the best thing would be to have those defined in their own file, then
> >>>have a way for the various CPU-specific .json files to pick and chose the
> >>>events they need from there.
> >>
> >>Right, this seems reasonable. Just need to check on feasible.
> >>
> >>In terms of coordinating this work, shall we do it? Will arm64+ThunderX
> >>support be accepted as is?
> >
> >Yes, that would be my preference if you don't mind.
>
>
> Cool.
>
> I am looking at this topic now. But I am doubting the folder structure
> again.
>
> Firstly, we still have this comment in the README:
> All the topic JSON files for a CPU model/family should be in a separate
> sub directory.
>
> Now, when thunderx3 or hip09 comes along, I assume that their jsons will
> similarly go into cavium and hisilcon folders, respectively. So, for
> example, we add thunderx3 json, like this:
> arm64/cavium/thunderx3-imp-def.json
> [
> {
> "PublicDescription": "foo",
> "EventCode": "0x40",
> "EventName": "bar",
> "BriefDescription": "sieve",
> },
> ]

Hang on, event 0x40 is one of the common ones, so that should be somewhere
else like arm64/armv8-common.json.

> #Family-model,Version,Filename,EventType
> 0x00000000420f5160,v1,cavium,core
> 0x00000000420f5161,v1,cavium,core
>
> Then we have generated pmu_events.c, like this:
> #include "../../pmu-events/pmu-events.h"
> struct pmu_event pme_cavium[] = {
> {
> .name = "bar",
> .event = "event=0x40",
> .desc = "sieve",
> .topic = "thunderx3 imp def",
> .long_desc = "foo",
> },
> {
> .name = "l1d_cache_rd",
> .event = "event=0x40",
> .desc = "L1D cache read",
> .topic = "thunderx2 imp def",
> .long_desc = "Attributable Level 1 data cache access, read",
> },
> {
> .name = "l1d_cache_wr",
> .event = "event=0x41",
> .desc = "L1D cache write",
> .topic = "thunderx2 imp def",
> .long_desc = "Attributable Level 1 data cache access, write ",
> },
>
> [ ... ]
>
> {
> .name = 0,
> .event = 0,
> .desc = 0,
> },
> };
> struct pmu_events_map pmu_events_map[] = {
> {
> .cpuid = "0x00000000420f5160",
> .version = "v1",
> .type = "core",
> .table = pme_cavium
> },
> {
> .cpuid = "0x00000000420f5161",
> .version = "v1",
> .type = "core",
> .table = pme_cavium
> },
> {
> .cpuid = 0,
> .version = 0,
> .type = 0,
> .table = 0,
> },
> };
>
> It doesn't look right, espcially since we have conflicting definitions for
> event 0x40. We really should have table per cpu.

There should be one definition of event 0x40 and it should use the wording
from the ARM ARM:

0x0040 , L1D_CACHE_RD, Attributable Level 1 data cache access, read
This event is similar to Level 1 data cache access, L1D_CACHE,
but the counter counts only memory-read operations that access
at least the Level 1 data or unified cache.

I think that the tricky bit is working out which subset of the
armv8-common.json file applies to a given CPU, but I was hoping you'd have
some ideas about that, even if it's an additional build step when building
perf to generate the final JSON files

Will