Re: [PATCH] perf/x86/intel/lbr: fix branch type encoding

From: Stephane Eranian
Date: Mon Aug 15 2022 - 18:33:07 EST


On Sun, Aug 14, 2022 at 12:37 PM Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2022-08-12 4:16 a.m., Andi Kleen wrote:
> >
> >>
> >> I think the option exists to avoid the overhead of disassembling
> >> branch instructions. See eb0baf8a0d92 ("perf/core: Define the common
> >> branch type classification")
> >> "Since the disassembling of branch instruction needs some overhead,
> >> a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it
> >> needs to disassemble the branch instruction and record the branch
> >> type."
> >
> >
> > Thanks for digging it out. So it was only performance.
> >
> >>
> >> I have no idea how big the overhead is. If we can always benefit from
> >> the branch type, I guess we can make it default on.
> >
> > I thought even arch LBR had one case where it had to disassemble, but
> > perhaps it's unlikely enough because it's pre filtered. If yes it may be
> > ok to enable it there unconditionally at the kernel level.
> >
>
> Yes, Arch LBR should have much less overhead than the previous
> platforms. The types of the most common branches, JCC and near JMP/CALL,
> come from the HW. Only the other branches, e.g., far call, SYS*, etc.,
> still rely on SW disassembly. The number of those other branches should
> not be big. I agree that we should enable the branch type for Arch LBR
> unconditionally at the kernel level.
>
> Peter? Stephane? What do you think?
>
> > Still have to decide if we want older parts to have more overhead by
> > default. I guess we would need some data on that.
>
I don't think you want that. It is okay to have it when it is free. Otherwise it
is best if it remains opt-in.
>
> The previous LBR already has high overhead. The branch type overhead
> will make it worse. I think it's better to keep it default off. We
> can make it clear in the documentation that the branch type is on by
> default only for the new platforms with Arch LBR support (12th-Gen+
> client or 4th-Gen Xeon+ server).
>
I am okay with that.
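
For reference, here is a minimal sketch of what the opt-in looks like from
user space on platforms where the branch type stays off by default. It only
fills in a perf_event_attr; the helper name init_branch_type_attr(), the
choice of the cycles event, and the sample period are illustrative
assumptions, not something taken from the patch.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper: set up an attr that samples user-space branches
 * and explicitly asks the kernel to record the branch type.
 * The event (CPU cycles) and the period are arbitrary example values.
 */
static void init_branch_type_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = 100003;

	/* Request the branch stack (LBR on x86) with each sample. */
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;

	/*
	 * PERF_SAMPLE_BRANCH_TYPE_SAVE is the opt-in discussed above:
	 * without it the branch type is not requested by the user.
	 */
	attr->branch_sample_type = PERF_SAMPLE_BRANCH_USER |
				   PERF_SAMPLE_BRANCH_ANY |
				   PERF_SAMPLE_BRANCH_TYPE_SAVE;

	attr->exclude_kernel = 1;
	attr->exclude_hv = 1;
}
```

The resulting attr would then be passed to the perf_event_open() syscall.
With the perf tool, the equivalent request should be the save_type branch
filter, e.g. "perf record -j any,save_type".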