Re: [PATCH 2/2] perf cs-etm: Set time on synthesised samples to preserve ordering

From: James Clark
Date: Fri Apr 16 2021 - 05:55:50 EST




On 15/04/2021 17:33, Leo Yan wrote:
> Hi James,
>
> On Thu, Apr 15, 2021 at 03:51:46PM +0300, James Clark wrote:
>
> [...]
>
>>> For the original perf data file with "--per-thread" option, the decoder
>>> runs into the condition for "etm->timeless_decoding"; and it doesn't
>>> contain ETM timestamp.
>>>
>>> Afterwards, the injected perf data file also misses ETM timestamp and
>>> hit the condition "etm->timeless_decoding".
>>>
>>> So I am confused about why the original perf data can be processed properly
>>> but fails to handle the injected perf data file.
>>
>> Hi Leo,
>>
>> My patch only deals with per-cpu mode. With per-thread mode everything is already working
>> because _none_ of the events have timestamps because they are not enabled by default:
>>
>> 	/* In per-cpu case, always need the time of mmap events etc */
>> 	if (!perf_cpu_map__empty(cpus))
>> 		evsel__set_sample_bit(tracking_evsel, TIME);
>>
>> When none of the events have timestamps, I think perf doesn't use the ordering code in
>> ordered-events.c. So when the inject file is opened, the events are read in file order.
>
> The explanation makes sense to me. One thought: if the original file
> doesn't use the ordered event, is it possible for the injected file to
> not use the ordered event as well?

Yes, if you inject on a file with no timestamps and then open it, the function queue_event()
in ordered-events.c is not hit.

If you create a file based on one with timestamps, then the queue_event() function is hit
even on the injected file.

The relevant bit of code is here:

	if (tool->ordered_events) {
		u64 timestamp = -1ULL;

		ret = evlist__parse_sample_timestamp(evlist, event, &timestamp);
		if (ret && ret != -1)
			return ret;

		ret = perf_session__queue_event(session, event, timestamp, file_offset);
		if (ret != -ETIME)
			return ret;
	}

	return perf_session__deliver_event(session, event, tool, file_offset);

The event is only queued if tool->ordered_events is set AND the timestamp for the sample
parses to a value that is neither zero nor -1. This check rejects everything else:

	if (!timestamp || timestamp == ~0ULL)
		return -ETIME;

If the check passes, the event is added to the queue; otherwise it goes straight through to
perf_session__deliver_event(). Ordering can be disabled manually via tool->ordered_events or
--disable-order, and it is also disabled by --dump-raw-trace.

So it seems that processing the file only really works in two cases: either no events are queued
and they already appear in the file in the right order, or the events are queued and all of them
have the right timestamps set.

>
> Could you confirm Intel-pt can work well for per-cpu mode for inject
> file?

Yes, it seems like synthesised samples are assigned sensible timestamps.

perf record -e intel_pt//u top
perf inject -i perf.data -o perf-intel-per-cpu.inject.data --itrace=i100i --strip
perf report -i perf-intel-per-cpu.inject.data -D

Results in the correct binary and DSO names and the SAMPLE timestamp is after the COMM:

0 381165621595220 0x1200 [0x38]: PERF_RECORD_COMM exec: top:20173/20173

...

2 381165622169297 0x13b0 [0x38]: PERF_RECORD_SAMPLE(IP, 0x2): 20173/20173: 0x7fdaa14abf53 period: 100 addr: 0
... thread: top:20173
...... dso: /lib/x86_64-linux-gnu/ld-2.27.so

Per-thread also works, but no samples or events have timestamps.

>
>> So it's not really about --per-thread vs per-cpu mode, it's actually about whether
>> PERF_SAMPLE_TIME is set, which is set as a by-product of per-cpu mode.
>>
>> I hope I understood your question properly.
>
> Thanks for info, sorry if I miss any info you have elaborated.
>
> Leo
>