Re: [PATCH] perf stat: Fix per socket shadow aggregation

From: Ian Rogers
Date: Tue Dec 07 2021 - 03:49:06 EST


On Mon, Dec 6, 2021 at 2:16 PM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> On Mon, Dec 6, 2021 at 2:44 AM James Clark <james.clark@xxxxxxx> wrote:
> >
> >
> >
> > On 04/12/2021 02:34, Ian Rogers wrote:
> > > An uncore device may have a CPU mask that specifies one CPU per socket:
> > > $ cat /sys/devices/uncore_imc_0/cpumask
> > > 0,18
> > > The perf_stat_config aggr_map will map a CPU to the socket and other
> > > aggregation values for it. Fix an error where the index into CPU mask
> > > was being used as the index into the aggr_map. For the cpumask above the
> > > indexes 0 and 1 are passed to aggr_map rather than the CPUs 0 and 18.
> > >
> > > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > > ---
> > > tools/perf/util/stat-display.c | 7 ++++---
> > > 1 file changed, 4 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> > > index 588601000f3f..7cfad5cfec38 100644
> > > --- a/tools/perf/util/stat-display.c
> > > +++ b/tools/perf/util/stat-display.c
> > > @@ -516,7 +516,7 @@ static void printout(struct perf_stat_config *config, struct aggr_cpu_id id, int
> > > static void aggr_update_shadow(struct perf_stat_config *config,
> > > struct evlist *evlist)
> > > {
> > > - int cpu, s;
> > > + int idx, cpu, s;
> > > struct aggr_cpu_id s2, id;
> > > u64 val;
> > > struct evsel *counter;
> > > @@ -525,11 +525,12 @@ static void aggr_update_shadow(struct perf_stat_config *config,
> > > id = config->aggr_map->map[s];
> > > evlist__for_each_entry(evlist, counter) {
> > > val = 0;
> > > - for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
> > > + for (idx = 0; idx < evsel__nr_cpus(counter); idx++) {
> > > + cpu = perf_cpu_map__cpu(evsel__cpus(counter), idx);
> > > s2 = config->aggr_get_id(config, evlist->core.cpus, cpu);
> >
> > Hi Ian,
> >
> > This same pattern of looping over the CPUs and calling aggr_get_id() is used a couple of
> > other times. For example in aggr_cb() and first_shadow_cpu(). Do you think these also
> > need updating?
>
> Thanks for the feedback James!
>
> For first_shadow_cpu the index is translated to the the cpu via the
> cpu map here:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/stat-display.c?h=perf/core#n343
> so I think it is sound.
>
> aggr_cb looks to have the same bug as the index is being passed as the cpu:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/stat-display.c?h=perf/core#n643
>
> > Or could we fix it in the aggr_get_id() functions so that they expect an index instead
> > of CPU ID and do the conversion themselves? The callbacks do say "idx" rather than "cpu"
> > so maybe there is still come confusion.
> >
> > For example:
> >
> > perf_stat__get_die_cached(struct perf_stat_config *config,
> > struct perf_cpu_map *map, int idx)
>
> Agreed on the naming confusion. I'm a fan of using single element
> structs to get type safety in code like this. I wonder here if a
> for_each_cpu on perf_cpu_map would clean this up best. I'll play with
> a v2 patch set that addresses this problem more widely.

So the real issue is the cpu map for the evlist (populated with all
CPUs) is being indexed from values from the uncore counter's cpu mask
in aggr_update_shadow. Given the easy confusion of index and cpu, I'm
pushing to eliminate the passing of the map and index in v2.

Thanks,
Ian

> Thanks,
> Ian
>
> > > if (!cpu_map__compare_aggr_cpu_id(s2, id))
> > > continue;
> > > - val += perf_counts(counter->counts, cpu, 0)->val;
> > > + val += perf_counts(counter->counts, idx, 0)->val;
> > > }
> > > perf_stat__update_shadow_stats(counter, val,
> > > first_shadow_cpu(config, counter, id),
> > >