[RFC] desired behaviour of syscall tracing wrt fork()

From: Al Viro
Date: Sat Oct 13 2012 - 15:51:13 EST


There's a lovely inconsistency regarding whether we call
trace_sys_exit() for the child process on return from fork()/clone()/etc.
The current situation:
* called on amd64 for 32bit newborns
* *NOT* called on i386 or amd64 for 64bit ones
* not called on arm
* called on ppc, s390, sh and sparc64
* not wired on anything else
Note that existing in-kernel users of that tracepoint (ftrace and perf)
both at least attempt to bail out in that situation. However, the
way it's done is not guaranteed to work if we wire more architectures -
it relies on syscall_get_nr() returning negative in child, which might
or might not work everywhere. If nothing else, it's a landmine to
avoid...
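
To make the fragility concrete, here is a minimal userspace sketch of the consumer-side guard described above. It is not the ftrace or perf code; struct mock_regs, mock_syscall_get_nr() and mock_exit_probe_records() are hypothetical stand-ins, and the only thing taken from the text is the "bail out if the syscall number is negative" check.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the saved user registers of a task. */
struct mock_regs {
	long orig_nr;	/* syscall number slot; what a newborn has here is arch-dependent */
};

/* Mock of syscall_get_nr(): reports whatever the arch left in the slot. */
static long mock_syscall_get_nr(const struct mock_regs *regs)
{
	return regs->orig_nr;
}

/* Consumer-side guard: returns true if the exit event would be recorded. */
static bool mock_exit_probe_records(const struct mock_regs *regs)
{
	long nr = mock_syscall_get_nr(regs);

	if (nr < 0)	/* the bail-out existing users rely on */
		return false;
	return true;
}
```

With orig_nr set to -1 the probe skips the newborn; with a non-negative number left in the slot (which a newly wired architecture might do) the same probe would record a bogus exit event for the child.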

FWIW, I'd vote for not calling syscall_trace_...() on the way
from ret_from_fork() - nothing in there really wants to be called
for newborns; e.g. TIF_SYSCALL_TRACE is explicitly turned off for
newborns, audit_syscall_exit() will not see ->in_syscall set and will
log nothing, and existing users of trace_sys_exit() at least attempt
to skip doing anything on those.
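
A rough model of the proposed behaviour, again as a userspace sketch rather than real entry code: the normal syscall return path fires the exit hook, the newborn's path does not go near it. All names here (struct mock_task, mock_syscall_return(), mock_ret_from_fork()) are made up for illustration; the real change would live in each architecture's entry code.

```c
#include <stdbool.h>

/* Hypothetical per-task state; exit_events counts simulated trace_sys_exit() calls. */
struct mock_task {
	bool trace_exit_enabled;
	int exit_events;
};

/* Stand-in for the tracepoint firing. */
static void mock_trace_sys_exit(struct mock_task *t)
{
	t->exit_events++;
}

/* Ordinary return from a syscall: exit hook runs if tracing is enabled. */
static void mock_syscall_return(struct mock_task *t)
{
	if (t->trace_exit_enabled)
		mock_trace_sys_exit(t);
}

/* Proposed newborn path: return to userland without any syscall exit hooks. */
static void mock_ret_from_fork(struct mock_task *t)
{
	(void)t;	/* deliberately no mock_trace_sys_exit() here */
}
```

The parent returning from fork() through mock_syscall_return() gets its exit event; the child going through mock_ret_from_fork() gets none, regardless of what its registers contain.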

Comments? AFAICS, it's not that much surgery to do...