Re: [PATCH 3/3] x86/static_call: Add support for Jcc tail-calls

From: Peter Zijlstra
Date: Tue Jan 24 2023 - 08:07:20 EST


On Mon, Jan 23, 2023 at 05:44:31PM -0500, Steven Rostedt wrote:
> On Mon, 23 Jan 2023 21:59:18 +0100
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> > Clang likes to create conditional tail calls like:
> >
> > 0000000000000350 <amd_pmu_add_event>:
> >  350:  0f 1f 44 00 00             nopl   0x0(%rax,%rax,1)               351: R_X86_64_NONE   __fentry__-0x4
> >  355:  48 83 bf 20 01 00 00 00    cmpq   $0x0,0x120(%rdi)
> >  35d:  0f 85 00 00 00 00          jne    363 <amd_pmu_add_event+0x13>   35f: R_X86_64_PLT32  __SCT__amd_pmu_branch_add-0x4
> >  363:  e9 00 00 00 00             jmp    368 <amd_pmu_add_event+0x18>   364: R_X86_64_PLT32  __x86_return_thunk-0x4
> >
>
> Just to confirm, as it's not clear if this is the static call site or one
> of the functions that is being called.

Ah, you've not looked at enough asm then? ;-) Yes, this is the static
call site; see the __SCT__ target (the jne at 0x35d, whose relocation
sits at 0x35f).

> I'm guessing that this is an issue because clang optimizes the static call
> site, right?

Specifically using Jcc (jne in this case) to tail-call the trampoline.
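
To make the encoding concrete, here is a minimal sketch of recognizing
that instruction form and computing where it branches. It is only an
illustration, not the code from the patch: the helper names
is_jcc_rel32() and jcc_rel32_target() are made up, and it assumes a
little-endian host.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Jcc rel32 is a two-byte opcode (0f 80..8f) plus a 4-byte displacement. */
#define JCC_REL32_LEN 6

/* Hypothetical helper: true if buf starts with a Jcc rel32 instruction. */
static bool is_jcc_rel32(const uint8_t *buf, size_t len)
{
	return len >= JCC_REL32_LEN &&
	       buf[0] == 0x0f && (buf[1] & 0xf0) == 0x80;
}

/*
 * Hypothetical helper: the branch target is the address of the next
 * instruction plus the signed 32-bit displacement stored at offset 2.
 */
static uint64_t jcc_rel32_target(uint64_t addr, const uint8_t *buf)
{
	int32_t disp;

	/* Assumes a little-endian host, matching the x86 encoding. */
	memcpy(&disp, buf + 2, sizeof(disp));
	return addr + JCC_REL32_LEN + (int64_t)disp;
}
```

Fed the bytes at 0x35d above (0f 85 00 00 00 00), the first helper
matches and the second yields 0x363, which is what objdump prints
before the relocation is applied. The displacement starts two bytes
into the instruction, which is why the relocation is listed at 0x35f.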

> > Teach the in-line static call text patching about this.
> >
> > Notably, since there is no conditional-ret, in that caes patch the Jcc
>
> "in that case"

typing so hard.. :-)