[BUG] perf/x86/intel/pebs: PEBS timestamps overwritten

From: Stephane Eranian
Date: Fri Aug 05 2022 - 06:50:09 EST


Hi,

I was alerted by an internal user that the PEBS TSC-based timestamps
do not appear
correctly in the final perf.data output file from perf record.

After some investigation, I came to the conclusion that indeed the
data->time field setup
by PEBS in the setup_pebs_fixed_sample_data() is later overwritten by
perf_events generic
code in perf_prepare_sample(). There is an ordering problem here.

Looking around we found that this problem had been uncovered back in
May 2020 and a
patch had been posted then:
https://lore.kernel.org/lkml/e754b625-bf14-8f5f-bd1a-71d774057005@xxxxxxxxx/T/

However this patch was never commented upon or committed.

The problem is still present in the upstream code today.

1. perf_sample_data_init()
2. setup_pebs_fixed_sample_data(): data->time =
native_sched_clock_from_tsc(pebs->tsc);
3. perf_prepare_sample(): data->time = perf_event_clock(event);

The patch from 2020 (Andreas Kogler) fixes the problem by making the
assignment in 3.
conditioned to the value of data->time being 0. Andreas also suggested
an alternative which
would break up the call to perf_event_ouput() like this is done in the
BTS code allowing
the prepare_sample() call to be made before PEBS samples are
extracted. That would
generate some code duplication. Although this approach appears more
robust, the one
issue I see is that prepare_sample may need data that would be filled
by PEBS and
therefore it would need to be called afterwards.

Any better ideas?
Thanks.