[PATCH] riscv: eliminate unreliable __builtin_frame_address(1)

From: Changbin Du
Date: Mon Jan 17 2022 - 10:44:55 EST


I tried different pieces of code which uses __builtin_frame_address(1)
(with both gcc version 7.5.0 and 10.3.0) to verify whether it works as
expected on riscv64. The result is negative.

What the compiler had generated is as below:
31 fp = (unsigned long)__builtin_frame_address(1);
0xffffffff80006024 <+200>: ld s1,0(s0)

It takes '0(s0)' as the address of frame 1 (caller), but the actual address
should be '-16(s0)'.

| ... | <-+
+-----------------+ |
| return address | |
| previous fp | |
| saved registers | |
| local variables | |
$fp --> | ... | |
+-----------------+ |
| return address | |
| previous fp --------+
| saved registers |
$sp --> | local variables |
+-----------------+

This leads the kernel can not dump the full stack trace on riscv.

[ 7.222126][ T1] Call Trace:
[ 7.222804][ T1] [<ffffffff80006058>] dump_backtrace+0x2c/0x3a

This problem is not exposed on most riscv builds just because the '0(s0)'
occasionally is the address frame 2 (caller's caller), if only ra and fp
are stored in frame 1 (caller).

| ... | <-+
+-----------------+ |
| return address | |
$fp --> | previous fp | |
+-----------------+ |
| return address | |
| previous fp --------+
| saved registers |
$sp --> | local variables |
+-----------------+

This could be a *bug* of gcc that should be fixed. But as noted in gcc
manual "Calling this function with a nonzero argument can have
unpredictable effects, including crashing the calling program.", let's
remove the '__builtin_frame_address(1)' in backtrace code.

With this fix now it can show full stack trace:
[ 10.444838][ T1] Call Trace:
[ 10.446199][ T1] [<ffffffff8000606c>] dump_backtrace+0x2c/0x3a
[ 10.447711][ T1] [<ffffffff800060ac>] show_stack+0x32/0x3e
[ 10.448710][ T1] [<ffffffff80a005c0>] dump_stack_lvl+0x58/0x7a
[ 10.449941][ T1] [<ffffffff80a005f6>] dump_stack+0x14/0x1c
[ 10.450929][ T1] [<ffffffff804c04ee>] ubsan_epilogue+0x10/0x5a
[ 10.451869][ T1] [<ffffffff804c092e>] __ubsan_handle_load_invalid_value+0x6c/0x78
[ 10.453049][ T1] [<ffffffff8018f834>] __pagevec_release+0x62/0x64
[ 10.455476][ T1] [<ffffffff80190830>] truncate_inode_pages_range+0x132/0x5be
[ 10.456798][ T1] [<ffffffff80190ce0>] truncate_inode_pages+0x24/0x30
[ 10.457853][ T1] [<ffffffff8045bb04>] kill_bdev+0x32/0x3c
...

Signed-off-by: Changbin Du <changbin.du@xxxxxxxxx>
---
arch/riscv/kernel/stacktrace.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index 201ee206fb57..14d2b53ec322 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -22,15 +22,16 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
bool (*fn)(void *, unsigned long), void *arg)
{
unsigned long fp, sp, pc;
+ int level = 0;

if (regs) {
fp = frame_pointer(regs);
sp = user_stack_pointer(regs);
pc = instruction_pointer(regs);
} else if (task == NULL || task == current) {
- fp = (unsigned long)__builtin_frame_address(1);
- sp = (unsigned long)__builtin_frame_address(0);
- pc = (unsigned long)__builtin_return_address(0);
+ fp = (unsigned long)__builtin_frame_address(0);
+ sp = sp_in_global;
+ pc = (unsigned long)walk_stackframe;
} else {
/* task blocked in __switch_to */
fp = task->thread.s[0];
@@ -42,7 +43,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
unsigned long low, high;
struct stackframe *frame;

- if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc)))
+ if (unlikely(!__kernel_text_address(pc) || (level++ >= 1 && !fn(arg, pc))))
break;

/* Validate frame pointer */
--
2.32.0