Re: [PATCH -next V7 1/7] riscv: ftrace: Fixup panic by disabling preemption

From: Mark Rutland
Date: Thu Jan 12 2023 - 07:16:52 EST


Hi Guo,

On Thu, Jan 12, 2023 at 04:05:57AM -0500, guoren@xxxxxxxxxx wrote:
> From: Andy Chiu <andy.chiu@xxxxxxxxxx>
>
> In RISCV, we must use an AUIPC + JALR pair to encode an immediate,
> forming a jump that jumps to an address over 4K. This may cause errors
> if we want to enable kernel preemption and remove dependency from
> patching code with stop_machine(). For example, if a task was switched
> out on auipc. And, if we changed the ftrace function before it was
> switched back, then it would jump to an address that has updated 11:0
> bits mixing with previous XLEN:12 part.
>
> p: patched area performed by dynamic ftrace
> ftrace_prologue:
> p| REG_S ra, -SZREG(sp)
> p| auipc ra, 0x? ------------> preempted
> ...
> change ftrace function
> ...
> p| jalr -?(ra) <------------- switched back
> p| REG_L ra, -SZREG(sp)
> func:
> xxx
> ret

As mentioned on the last posting, I don't think this is sufficient to fix the
issue. I've replied with more detail there:

https://lore.kernel.org/lkml/Y7%2F3hoFjS49yy52W@FVFF77S0Q05N/

Even in a non-preemptible SMP kernel, if one CPU can be in the middle of
executing the ftrace_prologue while another CPU is patching the
ftrace_prologue, you have the exact same issue.

For example, if CPU X is in the prologue and fetches the old AUIPC and the new
JALR (because it races with CPU Y modifying those), CPU X will branch to the
wrong address. The race window is much smaller in the absence of preemption,
but it's still there (and will be exacerbated in virtual machines, since the
hypervisor can preempt a vCPU at any time).
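
To make the failure concrete, here is a rough sketch of the arithmetic in
Python. It is purely illustrative and is not kernel code: the helper names and
all addresses are made up, and it only models how AUIPC + JALR combine a
20-bit upper part with a sign-extended 12-bit lower part.

```python
def split_offset(offset):
    """Split a PC-relative offset into the (hi20, lo12) pair used by AUIPC + JALR.

    JALR sign-extends its 12-bit immediate, so the upper part is rounded
    by 0x800 to compensate.
    """
    hi20 = (offset + 0x800) >> 12
    lo12 = offset - (hi20 << 12)      # always in [-2048, 2047]
    return hi20, lo12


def jump_target(pc, hi20, lo12):
    """Address reached by 'auipc ra, hi20' at pc followed by 'jalr lo12(ra)'."""
    ra = pc + (hi20 << 12)            # effect of AUIPC
    return ra + lo12                  # effect of JALR


# Hypothetical addresses, chosen only for the example.
PC = 0x80001000                       # address of the AUIPC in the prologue
OLD_TARGET = 0x80200010               # tracer installed before patching
NEW_TARGET = 0x80345678               # tracer installed by the patching CPU

old_hi, old_lo = split_offset(OLD_TARGET - PC)
new_hi, new_lo = split_offset(NEW_TARGET - PC)

# CPU X executes the old AUIPC, CPU Y patches both instructions,
# then CPU X executes the new JALR.
mixed = jump_target(PC, old_hi, new_lo)

print(hex(jump_target(PC, old_hi, old_lo)))   # old target
print(hex(jump_target(PC, new_hi, new_lo)))   # new target
print(hex(mixed))                             # neither of the two
```

With these made-up values the mixed pair lands on 0x80200678, which is neither
the old nor the new target. Nothing in the calculation depends on whether the
gap between the two instructions comes from preemption or from another CPU
patching concurrently.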

Note that the above is even assuming that instruction fetches are atomic, which
I'm not sure is the case; for example arm64 has special CMODX / "Concurrent
MODification and eXecution of instructions" rules which mean only certain
instructions can be patched atomically.

Either I'm missing something that provides mutual exclusion between the
patching and execution of the ftrace_prologue, or this patch is not sufficient.

Thanks,
Mark.

> Fixes: afc76b8b8011 ("riscv: Using PATCHABLE_FUNCTION_ENTRY instead of MCOUNT")
> Signed-off-by: Andy Chiu <andy.chiu@xxxxxxxxxx>
> Signed-off-by: Guo Ren <guoren@xxxxxxxxxx>
> ---
> arch/riscv/Kconfig | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index e2b656043abf..ee0d39b26794 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -138,7 +138,7 @@ config RISCV
> select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
> select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
> select HAVE_FUNCTION_GRAPH_TRACER
> - select HAVE_FUNCTION_TRACER if !XIP_KERNEL
> + select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !PREEMPTION
>
> config ARCH_MMAP_RND_BITS_MIN
> default 18 if 64BIT
> --
> 2.36.1
>