Re: [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL

From: Andi Kleen
Date: Tue Oct 13 2015 - 11:40:48 EST


> I'm wondering how frequent zero-length calls are. If they still occur in typical
> user-space, would it make sense to also have a separate branch sampling type for
> zero length calls?

Apparently 32-bit PIC binaries compiled with not-too-old versions of icc
still contain them. gcc has had this fixed for much longer.

But I'm not sure they're that interesting to sample by themselves.

> push the current IP on the stack:
>
> call next_addr
> next_addr:
> pop %reg
>
> which can take over 10 cycles on certain microarchitectures (and it unbalances
> whatever call stack tracking/caching the CPU does as well).
>
> So it might make sense to analyze them separately. I guess that's the reason why
> Intel added a separate flag for them in the PMU.

X86_BR_ZERO_CALL is only a software filter. There's no direct support for it
in the Intel hardware. It was added to make the LBR call stack more reliable,
which otherwise gets messed up by the zero length calls.
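
To illustrate what "software filter" means here, below is a minimal sketch of
such a check. It is not the kernel's code: is_zero_length_call() is a
hypothetical helper, and it assumes the branch source bytes are already
available and that the call is a plain near relative call (opcode 0xe8,
32-bit displacement, no prefixes). A zero displacement means the target is
the instruction right after the call, which is the "call next_addr" idiom
quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns 1 if the bytes at insn encode "call rel32" (opcode 0xe8) with a
 * displacement of zero, i.e. a call whose target is the next instruction.
 * Returns 0 for anything else, including buffers shorter than 5 bytes.
 */
static int is_zero_length_call(const uint8_t *insn, size_t len)
{
	int32_t rel;

	/* call rel32 is 1 opcode byte + 4 displacement bytes */
	if (len < 5 || insn[0] != 0xe8)
		return 0;

	/* memcpy avoids unaligned access; a zero test is endian-neutral */
	memcpy(&rel, insn + 1, sizeof(rel));
	return rel == 0;
}
```

Given the bytes e8 00 00 00 00 followed by a pop, the helper returns 1. A call
with any non-zero displacement returns 0 and would be treated as an ordinary
call. A real filter would need a proper instruction decoder to handle
prefixes and the other call encodings.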

-Andi

--
ak@xxxxxxxxxxxxxxx -- Speaking for myself only