Re: [PATCH 08/15] x86/alternatives: Teach text_poke_bp() to emulate instructions

From: Steven Rostedt
Date: Tue Jun 11 2019 - 11:27:28 EST


On Tue, 11 Jun 2019 10:03:07 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:


> So what happens is that arch_prepare_optimized_kprobe() <-
> copy_optimized_instructions() copies however much of the instruction
> stream is required such that we can overwrite the instruction at @addr
> with a 5 byte jump.
>
> arch_optimize_kprobe() then does the text_poke_bp() that replaces the
> instruction @addr with int3, copies the rel jump address and overwrites
> the int3 with jmp.
>
> And I'm thinking the problem is with something like:
>
> @addr: nop nop nop nop nop

What would work would be to:

	Add a breakpoint to the first opcode.

	Call synchronize_tasks();

	/* All tasks are now hitting the breakpoint and jumping over
	   the affected code */

	Update the rest of the instructions.

	Replace the breakpoint with the jmp.

One caveat is that the replaced instructions must not include a call.
If the called function calls schedule(), it will circumvent the
synchronize_tasks(). It would be OK if that call is the last of the
instructions. But I doubt we modify anything more than a call size
anyway, so this should still work for all current instances.
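
The ordering above can be sketched in user space as follows. This is
only an illustration of the write order on a plain byte buffer, not
kernel code: poke_jmp_sketch() and wait_for_all_tasks() are hypothetical
names, and the wait is a no-op stand-in for whatever grace-period
primitive synchronize_tasks() refers to. The opcode values (0xCC for
int3, 0xE9 for jmp rel32) are the standard x86 encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE  0xCC	/* x86 breakpoint */
#define JMP32_OPCODE 0xE9	/* x86 jmp rel32 */
#define JMP32_SIZE   5		/* opcode byte + 4 displacement bytes */

/* Hypothetical stand-in for the grace-period wait; a no-op here. */
static void wait_for_all_tasks(void)
{
}

/*
 * Turn the JMP32_SIZE bytes at text[off] into "jmp rel32" aimed at
 * text[target], using the order: int3, wait, tail bytes, opcode.
 */
static void poke_jmp_sketch(uint8_t *text, size_t len, size_t off,
			    size_t target)
{
	int32_t rel = (int32_t)((int64_t)target - (int64_t)(off + JMP32_SIZE));
	uint32_t urel = (uint32_t)rel;
	size_t i;

	assert(off + JMP32_SIZE <= len);

	/* Add a breakpoint to the first opcode. */
	text[off] = INT3_OPCODE;

	/* Wait until no task can still be inside the old instructions. */
	wait_for_all_tasks();

	/* Update the rest of the instructions (little-endian rel32). */
	for (i = 0; i < 4; i++)
		text[off + 1 + i] = (uint8_t)(urel >> (8 * i));

	/* Replace the breakpoint with the jmp. */
	text[off] = JMP32_OPCODE;
}
```

The displacement is relative to the end of the 5 byte jump, which is
why JMP32_SIZE is added to off before subtracting.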

-- Steve

>
> We copy out the nops into the trampoline, overwrite the first nop with
> an INT3, overwrite the remaining nops with the rel addr, but oops,
> another CPU can still be executing one of those NOPs, right?
>
> I'm thinking we could fix this by first writing INT3 into all relevant
> instructions, which is going to be messy, given the current code base.
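
For illustration, the "INT3 into all relevant instructions" idea could
look roughly like this on a plain byte buffer. It is a user-space sketch
only: poke_int3_all_sketch() and the insn_off[] array of instruction
start offsets are hypothetical, and real code would have to decode the
instruction stream to find those boundaries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC	/* x86 breakpoint */

/*
 * Write INT3 over the first byte of every instruction in the region
 * that is about to be overwritten. insn_off[] holds the (assumed
 * already decoded) start offset of each such instruction.
 */
static void poke_int3_all_sketch(uint8_t *text, size_t len,
				 const size_t *insn_off, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		assert(insn_off[i] < len);
		text[insn_off[i]] = INT3_OPCODE;
	}
}
```

For the five single-byte nops at @addr, insn_off[] would be
{ 0, 1, 2, 3, 4 }, so every byte in the region becomes an INT3 before
any of them is overwritten with the rel addr.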