RE: [PATCH v4 00/22] perf: Add infrastructure and support for Intel PT

From: Michael Williams
Date: Mon Sep 08 2014 - 09:09:41 EST



Alexander Shishkin wrote:
> Mathieu Poirier <mathieu.poirier@xxxxxxxxxx> writes:
>
>> On 4 September 2014 02:26, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>> On Tue, Sep 02, 2014 at 02:18:16PM -0600, Mathieu Poirier wrote:
>>>> Pawell, many thanks for looping me in.
>>>>
>>>> I am definitely not a perf-internal guru and as such won't be able to
>>>> comment on the implementation. On the flip side it is easy for me to
>>>> see how the work on coresight done at Linaro can be made to tie-in
>>>> what Alexander is proposing. Albeit not at the top of the priority
>>>> list at this time, integration with perf (and ftrace) is definitely
>>>> on the roadmap.
>>>>
>>>> Powell is correct in his statement that Linaro's work in HW trace
>>>> decoding is (currently) mainly focused on processor tracing but that
>>>> will change when we have the basic infrastructure upstreamed.
>>>>
>>>> Last but not least it would be interesting to have more information
>>>> on the "sideband data". With coresight we have something called
>>>> "metadata", also related to how the trace source was configured and
>>>> instrumental to proper trace decoding. I'm pretty sure we are facing
> the same problems.
>>>
>>> So we use the sideband or AUX data stream to export the 'big' data
>>> stream generated by the CPU in an opaque manner. For every AUX data
>>> block 'posted' we issue an event into the regular data buffer that
>>> describes it.
>>
>> In the context of "describe it" written above, what kind of
>> information one would typically find in that description?
>
> It's like "got a chunk of AUX data in the AUX buffer, starting at offset
> $X, length $Y".
>
>>>
>>> I was assuming that both ARM and MIPS would generate a single data
>>> stream as well. So please do tell more about your meta-data; is that
>>> a one time thing or a second continuous stream of data, albeit
>>> smaller than the main stream?
>>
>> Coresight does indeed generate a single steam of compressed data.
>> Depending on the tracing specifics that were requested by the use case
>> (trace engine configuration) the format of the packets in the
>> compressed stream will change. Since the compressed stream itself
>> doesn't carry clues about the formatting information, knowledge of how
>> the trace engine was configured is mandatory for the proper decoding
>> of the trace stream.
>
> Ok, in perf the trace configuration would be part of 'session'
> information, so the way the tracing was configured by userspace will be
> saved to the resulting trace file (perf.data) by the userspace.
> We have that with Intel PT as well.
>
>> Metadata refer to exactly that - the configuration of the trace
>> engine. It has to be somehow lumped with the trace stream for off
>> target analysis.
>
> What we call sideband data in our code is more like runtime metadata,
> such as executable mappings (so that you know what is mapped to which
> addresses, iirc ETM/PTM also deals in virtual addresses, so you'll need
> this information to make sense of the trace, right?) and context
> switches.
>
> One of the great things about perf here is that it provides all this
> information practically for free.
>
>>>
>>> The way I read your explanation it is a one time blob generated once
>>> you setup the hardware.
>>
>> Correct.
>>
>>> I suppose we could either dump it once into the normal data stream or
>>> maybe dump it once every time we generate an AUX buffer event into
>>> the normal data stream -- if its not too big.
>>
>> Right, there is a set of meta-data to be generated with each trace
>> run. With the current implementation a "trace run" pertains to all
>> the information collected between the beginning and end of a trace
>> scenario. Future work involve triggering a DMA transfer of the full
>> coresight buffer to a kernel memory area, something that is probably
>> close to the "buffer event" you are referring to.
>
> Again correct me if I'm wrong, but the TMC(?) controller can be
> configured to direct ETM/PTM output right into system memory by means of
> a scatter-gather table. This is what we call AUX area, it's basically a
> circular buffer with trace data. Trace output is sent to the system
> memory, which is also mmap()ed to the userspace tracing tool (perf), so
> that it can capture it in real time. Well, that's one of the scenarios.

Correct. However there are two provisos:

Firstly, not all systems will have the Trace Memory Controller (TMC) in the Embedded Trace Router (ETR) configuration that can write directly to system memory. Many systems have a dedicated Embedded Trace Buffer (ETB). This is a dedicated SRAM for collecting trace that is not memory-mapped. It can only be accessed by the driver. Getting the data out is quick compared to reconstructing the trace. Systems can have a combination of multiple ETRs, ETBs, ...

Secondly, ETR/ETB etc. might contain multiple traces from different sources and different processors multiplexed together with the Trace Wrapping Protocol (TWP). For security reasons you might decide to unwrap this in privileged mode code, not in userspace. Again this is quick compared to reconstructing the trace.

So you might need to support copying the data out into a userspace buffer.

Mike.
--
Michael Williams, Principal Engineer, ARM Limited, Cambridge UK
ARM: The Architecture for the Digital World http://www.arm.com



-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/