Re: [PATCH v3 0/6] Static calls

From: Steven Rostedt
Date: Mon Feb 17 2020 - 16:57:20 EST


On Mon, 17 Feb 2020 22:10:27 +0100
Jann Horn <jannh@xxxxxxxxxx> wrote:

> On Thu, Jan 10, 2019 at 9:52 PM Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> > On Thu, Jan 10, 2019 at 09:30:23PM +0100, Peter Zijlstra wrote:
> > > On Wed, Jan 09, 2019 at 04:59:35PM -0600, Josh Poimboeuf wrote:
> > > > With this version, I stopped trying to use text_poke_bp(), and instead
> > > > went with a different approach: if the call site destination doesn't
> > > > cross a cacheline boundary, just do an atomic write. Otherwise, keep
> > > > using the trampoline indefinitely.
> > >
> > > > - Get rid of the use of text_poke_bp(), in favor of atomic writes.
> > > > Out-of-line calls will be promoted to inline only if the call sites
> > > > don't cross cache line boundaries. [Linus/Andy]
> > >
> > > Can we preserve why text_poke_bp() didn't work? I seem to have forgotten
> > > again. The problem was poking the return address onto the stack from the
> > > int3 handler, or something along those lines?
> >
> > Right, emulating a call instruction from the #BP handler is ugly,
> > because you have to somehow grow the stack to make room for the return
> > address. Personally I liked the idea of shifting the iret frame by 16
> > bytes in the #DB entry code, but others hated it.
> >
> > So many bad-but-not-completely-unacceptable options to choose from.
>
> Silly suggestion from someone who has skimmed the thread:
>
> Wouldn't a retpoline-style trampoline solve this without needing
> memory allocations? Let the interrupt handler stash the destination in
> a percpu variable and clear IF in regs->flags. Something like:

Linus actually suggested something similar for ftrace, but once
implemented it proved hard to get right, and it caused havoc with
utilities like lockdep, and also with shadow stacks.

See this thread:

https://lore.kernel.org/linux-kselftest/CAHk-=wh5OpheSU8Em_Q3Hg8qw_JtoijxOdPtHru6d+5K8TWM=A@xxxxxxxxxxxxxx/


-- Steve