Re: [PATCH v6 2/6] arm64: ptrace: allow tracer to skip a system call

From: AKASHI Takahiro
Date: Wed Oct 01 2014 - 07:08:20 EST


Will,

When I was looking into syscall_trace_exit() more closely, I found
another (big) problem.
There are two system calls, execve() and rt_sigreturn(), which change
'syscallno' in pt_regs to -1 in start_thread() and restore_sigframe(),
respectively.

Since syscallno is no longer valid by the time syscall_trace_exit() runs for
these system calls, we cannot create a correct syscall exit record, either for
the tracepoint in trace_sys_exit() (=> ftrace_syscall_exit()) or for audit in
audit_syscall_exit().

This does not happen on arm because the syscall number is kept in
thread_info there, not in pt_regs.
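
To make the problem concrete, here is a small user-space model, not kernel
code. struct fake_regs, fake_clobbering_syscall() and run_syscall() are names
made up for illustration, and 221 in the usage note is only an example
syscall number. It contrasts reading the number from the register frame at
exit with reading a copy saved at entry (roughly what keeping it in
thread_info gives arm):

```c
/* Hypothetical stand-in for the part of pt_regs that matters here. */
struct fake_regs {
	long syscallno;
};

/* Models start_thread() / restore_sigframe() resetting syscallno to -1. */
static void fake_clobbering_syscall(struct fake_regs *regs)
{
	regs->syscallno = -1;
}

/*
 * Runs one modelled syscall and returns the number an exit hook would see.
 * clobbers:       nonzero if the syscall behaves like execve()/rt_sigreturn()
 * use_saved_copy: nonzero to read the copy taken at entry instead of regs
 */
static long run_syscall(long nr, int clobbers, int use_saved_copy)
{
	struct fake_regs regs = { .syscallno = nr };
	long saved = regs.syscallno;	/* copy taken before the syscall body runs */

	if (clobbers)
		fake_clobbering_syscall(&regs);

	return use_saved_copy ? saved : regs.syscallno;
}
```

In this model, run_syscall(221, 1, 0) yields -1 (the exit record loses the
number), while run_syscall(221, 1, 1) still yields 221.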

How can we deal with this issue?

-Takahiro AKASHI


On 08/27/2014 02:51 AM, Will Deacon wrote:
On Fri, Aug 22, 2014 at 01:35:17AM +0100, AKASHI Takahiro wrote:
On 08/22/2014 02:08 AM, Kees Cook wrote:
On Thu, Aug 21, 2014 at 3:56 AM, AKASHI Takahiro
<takahiro.akashi@xxxxxxxxxx> wrote:
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 8876049..c54dbcc 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1121,9 +1121,29 @@ static void tracehook_report_syscall(struct pt_regs *regs,

 asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
+	unsigned int saved_syscallno = regs->syscallno;
+
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);

+	if (IS_SKIP_SYSCALL(regs->syscallno)) {
+		/*
+		 * RESTRICTION: we can't modify a return value of user
+		 * issued syscall(-1) here. In order to ease this flavor,
+		 * we need to treat whatever value in x0 as a return value,
+		 * but this might result in a bogus value being returned.
+		 */
+		/*
+		 * NOTE: syscallno may also be set to -1 if fatal signal is
+		 * detected in tracehook_report_syscall_entry(), but since
+		 * a value set to x0 here is not used in this case, we may
+		 * neglect the case.
+		 */
+		if (!test_thread_flag(TIF_SYSCALL_TRACE) ||
+				(IS_SKIP_SYSCALL(saved_syscallno)))
+			regs->regs[0] = -ENOSYS;
+	}
+
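
The decision made by the hunk above can be modelled in user space roughly as
follows. This is only a sketch: struct model_regs and model_skip_fixup() are
hypothetical names, the IS_SKIP_SYSCALL() definition is an assumption (the
real macro is defined elsewhere in the series), and the "traced" flag stands
in for test_thread_flag(TIF_SYSCALL_TRACE).

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed definition, for illustration only. */
#define IS_SKIP_SYSCALL(no) ((int)(no) == -1)

/* Hypothetical stand-in for the two pt_regs fields the hunk touches. */
struct model_regs {
	long x0;	/* models regs->regs[0] */
	int syscallno;	/* models regs->syscallno */
};

/*
 * Applies the same condition as the hunk, after the tracer (if any) has
 * had its chance to rewrite syscallno and x0 at the entry stop.
 * saved_syscallno is the value the user originally issued.
 */
static void model_skip_fixup(struct model_regs *regs, bool traced,
			     int saved_syscallno)
{
	if (IS_SKIP_SYSCALL(regs->syscallno)) {
		/*
		 * Force -ENOSYS unless a tracer turned a valid syscall
		 * into -1, in which case x0 is left as the tracer set it.
		 */
		if (!traced || IS_SKIP_SYSCALL(saved_syscallno))
			regs->x0 = -ENOSYS;
	}
}
```

With this model, a tracer that rewrites a valid syscall to -1 keeps the x0 it
chose, while a user-issued syscall(-1) always ends up with -ENOSYS, even if a
tracer tried to set something else. The latter is the RESTRICTION noted in
the comment.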

I don't have a runtime environment for arm64 yet, so I can't test this
directly myself; I'm just trying to eyeball it. :)

Once the seccomp logic is added here, I don't think using -2 as a
special value will work. Doesn't this mean an Oops can be triggered by the
user issuing a "-2" syscall? As in, if TIF_SYSCALL_WORK is set and
the user passes -2 as the syscall, audit will be called only on entry
and then skipped on exit?

Oops, you're absolutely right. I didn't think of this case.
syscall_trace_enter() should not return syscallno directly, but should always
return -1 if syscallno < 0 (except when secure_computing() returns with -1).
This also implies that tracehook_report_syscall() needs to have a return value.
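
As a rough sketch of the return convention proposed here (a user-space model
only; model_enter_retval() is a hypothetical name, and the secure_computing()
exception mentioned above is deliberately left out):

```c
/*
 * Models the value syscall_trace_enter() would hand back to the entry
 * code: any negative syscallno, including a user-issued -2, collapses
 * to -1, so no other negative number can act as a special value.
 */
static int model_enter_retval(int syscallno)
{
	return syscallno < 0 ? -1 : syscallno;
}
```

Under this model a user passing -2 is indistinguishable from -1 by the time
the entry code looks at the return value.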

Will, is this fine with you?

Well, the first thing that jumps out at me is why this is being done
completely differently for arm64 and arm. I thought adding the new ptrace
requests would reconcile the differences?

Will