Re: [PATCH V7 13/17] perf, x86: enable LBR callstack when recording callchain

From: Stephane Eranian
Date: Wed Nov 05 2014 - 05:57:15 EST


On Wed, Nov 5, 2014 at 11:43 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Wed, Nov 05, 2014 at 10:58:28AM +0100, Stephane Eranian wrote:
>> On Wed, Nov 5, 2014 at 10:21 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> > On Tue, Nov 04, 2014 at 09:56:09PM -0500, Kan Liang wrote:
>> >> From: Yan, Zheng <zheng.z.yan@xxxxxxxxx>
>> >>
>> >> Only enable LBR callstack when user requires fp callgraph. The feature
>> >> is not available when PERF_SAMPLE_BRANCH_STACK or PERF_SAMPLE_STACK_USER
>> >> is required.
>> >> Also, this feature only affects how to get user callchain. The kernel
>> >> callchain is always got by frame pointers.
>> >
>> > Since FP callchains should not change, this doesn't appear to make any
>> > sense either.
>>
>> If I recall the earlier discussion, the FP callchains are not changed. On
>> HSW, when requesting fp at the user level only, then the kernel
>> automatically tries to use the LBR callstack mode. Advantage is that
>> the user app does not require frame-pointer or dwarf debug info to get
>> correct callchains with perf record. The downside is that LBR
>> callstack does not work in certain callchain corner cases.
>
> But this patch changes the FP callchain interface. I see no need of
> that. We already have multiple independent callchain options (FP and
> Dwarf) adding a third option should also be independent (LBR).
>
> Allowing all 3 at the same time allows for identifying those corner
> cases.
>
> That is, I simply don't see a good reason to intertwine these things at the
> interface level. All it does is reduce options. Would it not be 'nice'
> to allow both FP and LBR at the same time?

Yes, but I wonder how the tool would sort this out if it gets both FP and
LBR callchains for each sample.
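
To make the question concrete, here is a rough sketch of the kind of
per-sample comparison a tool would need if both chains were delivered. This
is not from the patch or from perf; `struct chain` and `first_divergence()`
are hypothetical names, and I am assuming both chains are plain arrays of
return addresses ordered innermost frame first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-sample view of one user callchain. */
struct chain {
	const uint64_t *ip;	/* return addresses, innermost frame first */
	size_t nr;		/* number of entries */
};

/*
 * Return the index of the first frame where the FP and LBR chains
 * disagree, or -1 if they are identical (same length, same addresses).
 */
static long first_divergence(const struct chain *fp, const struct chain *lbr)
{
	size_t n = fp->nr < lbr->nr ? fp->nr : lbr->nr;
	size_t i;

	for (i = 0; i < n; i++)
		if (fp->ip[i] != lbr->ip[i])
			return (long)i;

	/* One chain is a strict prefix of the other: they part at the shorter length. */
	return fp->nr == lbr->nr ? -1 : (long)n;
}
```

A sample where this returns something other than -1 would be a candidate
for one of the corner cases, but the tool still has to decide which of the
two chains to believe.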

My understanding of the patch is that it does not change the user interface,
it changes the way callchains are gathered by the kernel on HSW.

Is there an explicit mention in the API that CALLCHAIN relies on FP?
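
For reference, this is roughly what a user-level-only callchain request
looks like from the tool side. It is a minimal sketch, not taken from the
patch: the helper name `init_user_callchain_attr()` and the sample period
are made up for illustration, while the fields and constants come from
<linux/perf_event.h>. Note that nothing in the request names the mechanism
the kernel should use to build the chain.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Fill in an attr that asks for user-level callchains only.
 * The request says "give me a callchain"; it does not say FP.
 */
static void init_user_callchain_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = 100000;	/* arbitrary, for illustration only */
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_CALLCHAIN;
	attr->exclude_callchain_kernel = 1;	/* user frames only */
}
```

As I read the changelog, a request like this one, without
PERF_SAMPLE_BRANCH_STACK or PERF_SAMPLE_STACK_USER, is the case the kernel
would try to serve from the LBR callstack on HSW.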

I think in general it would be better for tools to know which low-level
mechanism is used to better interpret the results and especially be aware
of the limitations of each mechanism.

I think the patch is trying some auto-promotion of CALLCHAIN from FP to LBR,
based on the belief that LBR is better in most cases. It reminds me of the
discussion about precise mode: why not default to precise for all events
that support it?

I would be okay if the patch introduced a third mode for callchains.