Re: [PATCH 1/1] system call notification with self_ptrace

From: Pierre Morel
Date: Fri Sep 12 2008 - 08:24:30 EST


Hello Oleg,

You are right, the functionality can be implemented with the system call.
But it means we have the overhead of a system call just to clear two bits,
the TIF_SYSCALL_TRACE and the PTS_SELF.

On the other hand we have an overhead of one single "if" inside
the handle_signal() function.

We can do the same with fork and ptrace, yes, but with a very big overhead on each system call and this is why this patch is so usefull: because with this patch you sit inside the thread when analysing it and have a direct access to all data without the need of IPC, ptrace or any task switch.

I will provide a test program and plan to release a tracing tool based on it.
I think I can reduce the task struct modification by using just a bit like you suggest if nobody seen any problem with this.

best regards,

Pierre

Oleg Nesterov wrote:
On 09/10, Pierre Morel wrote:
Oleg Nesterov wrote:
I still think this patch shouldn't change handle_signal().

Once again. The signal handler for SIGSYS can first do
sys_ptrace(PTRACE_SELF_OFF) (which is filtered out), and then use any
other syscall, so this change is not needed, afaics.

Yes it can but what if the application forget to do it?
It is a security so that the application do not bounce for ever.

The (buggy) task can be killed, this has nothing to do with security.

From the security pov, this case doesn't differ from, say,

void sigh(int sig)
{
kill(getpid(), sig);
}

void main(void)
{
signal(SIGSYS, sigh);
kill(getpid(), SIGSYS);
}

Or I missed something?

So, PTRACE_SELF_XXX disables the "normal" ptrace. Not sure this is good.

I think that having two tracing system one over the other may be
quite difficult to handle.

Yes I see.

But... well, I think we need Roland's opinion. I must admit, I am a bit
sceptical about this patch ;) I mean, I don't really understand why it
is useful. We can do the same with fork() + ptrace(). Yes, in that
case we need an "extra" context switch for any traced syscall. But,
do you have any "real life" example to demonstrate that the user-space
solution sucks? We can even use CLONE_MM to speedup the context switch.

Pierre, don't get me wrong. I never used debuggers for myself, I will
be happy to know I am wrong. I just don't understand.


As for ->instrumentation. If you are going to remove PTS_INSTRUMENTED,
we need only one bit. We could use PF_PTS_SELF, but ->flags is already
"contended". Perhaps you can do something like

--- include/linux/sched.h
+++ include/linux/sched.h
@@ -1088,6 +1088,7 @@ struct task_struct {
/* ??? */
unsigned int personality;
unsigned did_exec:1;
+ unsigned pts_self:1;
pid_t pid;
pid_t tgid;


Both did_exec and pts_self can only be changed by current, so it is
safe to share the same word. This way we don't enlarge task_struct.

Oleg.



--
=============
Pierre Morel
RTOS and Embedded Linux

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/