Re: [PATCH v2 0/7] Implement inline static calls on PPC32 - v2

From: Ard Biesheuvel
Date: Sat Jul 09 2022 - 02:52:54 EST


Hello Christophe,

On Fri, 8 Jul 2022 at 19:32, Christophe Leroy
<christophe.leroy@xxxxxxxxxx> wrote:
>
> This series applies on top of v3 of the series "objtool: Enable and
> implement --mcount option on powerpc" [1], rebased on the powerpc-next branch.
>
> A few modifications are done to core parts to enable powerpc
> implementation:
> - R_X86_64_PC32 is abstracted to R_REL32 so that it can then be
> redefined as R_PPC_REL32.
> - A call to static_call_init() is added to start_kernel() so that
> each architecture does not have to call it itself
> - The trampoline address is passed to arch_static_call_transform() even
> when setting a site, so that the site can fall back on a call to the
> trampoline when the target is too far.
>
> [1] https://lore.kernel.org/lkml/70b6d08d-aced-7f4e-b958-a3c7ae1a9319@xxxxxxxxxx/T/#rb3a073c54aba563a135fba891e0c34c46e47beef
>
> Christophe Leroy (7):
> powerpc: Add missing asm/asm.h for objtool
> objtool/powerpc: Activate objtool on PPC32
> objtool: Add architecture specific R_REL32 macro
> objtool/powerpc: Add necessary support for inline static calls
> init: Call static_call_init() from start_kernel()
> static_call_inline: Provide trampoline address when updating sites
> powerpc/static_call: Implement inline static calls
>

Could you quantify the performance gains of moving from out-of-line,
patched tail-call branch instructions to full-fledged inline static
calls? On x86, the retpoline problem makes this glaringly obvious, but
on other architectures, the complexity of supporting this model may
outweigh the performance advantages.
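
For context on the "target is too far" fallback in the cover letter: a
PPC32 relative branch (b/bl) carries a 24-bit word offset, so it reaches
roughly +/- 32 MiB from the call site. Below is a minimal sketch of the
decision an inline call site has to make, assuming that model. The helper
names (branch_in_range, encode_bl, patch_site) are hypothetical and are not
taken from the series; only the instruction encoding is architectural.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * PPC32 I-form branch: primary opcode 18, a 24-bit LI field that is
 * shifted left by 2 and sign-extended, then the AA and LK bits.
 * Reachable displacement: [-32 MiB, +32 MiB - 4], word aligned.
 */
#define PPC_BRANCH_MIN (-0x2000000LL)
#define PPC_BRANCH_MAX (0x1fffffcLL)
#define PPC_INST_BL    0x48000001u   /* opcode 18, AA=0, LK=1 */

/* Hypothetical helper: can a relative branch at 'site' reach 'target'? */
static bool branch_in_range(uint32_t site, uint32_t target)
{
    int64_t offset = (int64_t)target - (int64_t)site;

    return offset >= PPC_BRANCH_MIN && offset <= PPC_BRANCH_MAX &&
           (offset & 3) == 0;
}

/* Hypothetical helper: encode "bl target" for an instruction at 'site'.
 * The caller is expected to have checked the range first. */
static uint32_t encode_bl(uint32_t site, uint32_t target)
{
    uint32_t offset = target - site;

    return PPC_INST_BL | (offset & 0x03fffffcu);
}

/*
 * Hypothetical sketch of patching one inline site: call the target
 * directly when it is reachable, otherwise call the trampoline, which
 * is assumed to sit close to the call site and to forward to the target.
 */
static uint32_t patch_site(uint32_t site, uint32_t target, uint32_t tramp)
{
    uint32_t dest = branch_in_range(site, target) ? target : tramp;

    return encode_bl(site, dest);
}
```

In this sketch a site only saves the extra hop through the trampoline when
the target is within branch range, which is the gain the question above asks
to be measured.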