Re: [PATCH v2 4/7] powerpc/ftrace: Additionally nop out the preceding mflr with -mprofile-kernel

From: Naveen N. Rao
Date: Thu Jun 27 2019 - 10:51:07 EST


Naveen N. Rao wrote:
With -mprofile-kernel, gcc emits 'mflr r0', followed by 'bl _mcount' to
enable function tracing and profiling. So far, with dynamic ftrace, we
used to only patch out the branch to _mcount(). However, mflr is
executed by the branch unit that can only execute one per cycle on
POWER9 and shared with branches, so it would be nice to avoid it where
possible.

We cannot simply nop out the mflr either. When enabling function
tracing, there can be a race if tracing is enabled when some thread was
interrupted after executing a nop'ed out mflr. In this case, the thread
would execute the now-patched-in branch to _mcount() without having
executed the preceding mflr.

To solve this, we now enable function tracing in 2 steps: patch in the
mflr instruction, use 'smp_call_function(isync);
synchronize_rcu_tasks()' to ensure all existing threads make progress,
and then patch in the branch to _mcount(). We override
ftrace_replace_code() with a powerpc64 variant for this purpose.

Suggested-by: Nicholas Piggin <npiggin@xxxxxxxxx>
Reviewed-by: Nicholas Piggin <npiggin@xxxxxxxxx>
Signed-off-by: Naveen N. Rao <naveen.n.rao@xxxxxxxxxxxxxxxxxx>
---
arch/powerpc/kernel/trace/ftrace.c | 258 ++++++++++++++++++++++++++---
1 file changed, 236 insertions(+), 22 deletions(-)


I missed adding a comment here to explain the changes. As discussed in the previous series, I think we are ok with this patch from a CMODX perspective. For smp_call_function(), I decided to have it included in this patch since we know that we need it here for sure. I am not entirely sure we want to do that in patch_instruction() since ftrace doesn't seem to need it elsewhere. As Nick Piggin pointed out, we may want to have users of patch_instruction() (kprobes) add the necessary synchronization.


Thanks,
Naveen