Re: [patch 20/24] perfmon: system calls interface

From: Ingo Molnar
Date: Thu Nov 27 2008 - 09:42:44 EST



* stephane eranian <eranian@xxxxxxxxxxxxxx> wrote:

> Ingo,
>
> On Wed, Nov 26, 2008 at 3:00 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
> >
> > Thirdly, the check for ->exit_state in pfm_task_incompatible() is
> > not needed: we've just passed ptrace_check_attach() so we know we
> > just transitioned the task to task->state == TASK_TRACED.
> >
> > If you _ever_ see a task exit TASK_TRACED and go zombie or dead
> > from there without this code allowing it that means the whole
> > state machine with ptrace is borked up by perfmon. For example i
> > dont see where the perfmon-control task parents itself as the
> > exclusive debugger (parent) of the debuggee-task.
> >
>
> Perfmon requires ptrace ONLY to stop the thread you want to operate
> on. For instance, to read the counters in a thread via pfm_read(),
> you need to have that thread stopped, so perfmon can extract the
> machine state safely. But when the monitored thread runs, it does
> not have to remain under the control of ptrace. All that is needed
> is that the thread is stopped while we are in the perfmon syscall. I
> think ptrace allows this today. We will be able to drop ptrace()
> once we switch to utrace in which case, the kernel will be able to
> easily stop the thread when entering the perfmon syscalls. I guess I
> don't quite understand the meaning of your last sentence.

The meaning of my last sentence is the gist of my argument: you cannot
do it like this! You are using a bit of the ptrace infrastructure, but
unsafely, as pointed out here.

And the thing is, I fail to understand the whole justification for the
new sys_pfm_attach()/PFM_NO_TARGET system calls.

Firstly, there's a taste issue: why didn't you add sys_pfm_detach
instead of adding a butt-ugly PFM_NO_TARGET special case to
sys_pfm_attach() that maps to pfm_detach?

But more importantly, and very fundamentally: why did you implement it
as a special system call? Why didn't you extend ptrace to read/write
the PMU context? It is _trivial_ and needs no new syscalls at all:
just a new ptrace parameter to arch_ptrace(). And ptrace will drive
the TASK_TRACED state machine safely - it already stops/starts tasks
to read/write hardware context safely.
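
To illustrate the stop-then-read pattern referred to above, here is a
minimal user-space sketch that uses only existing ptrace requests. It
assumes Linux with glibc on x86-64 or arm64 (where sys/user.h provides
struct user_regs_struct); the function name read_regs_of_stopped_child
is made up for this example. It reads the general-purpose registers,
not PMU state: a PMU extension along the lines suggested here would be
an additional request handled next to PTRACE_GETREGSET, and no such
request is shown because none is defined in this thread.

```c
#define _GNU_SOURCE
#include <elf.h>        /* NT_PRSTATUS */
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>    /* struct iovec */
#include <sys/user.h>   /* struct user_regs_struct (x86-64, arm64) */
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that asks to be traced and then stops itself. Once the
 * parent has observed the stop, read the child's general-purpose
 * registers via ptrace.
 * Returns the number of bytes of register state read, or -1 on error.
 */
long read_regs_of_stopped_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: become a tracee of the parent, then stop. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        raise(SIGSTOP);
        _exit(0);
    }

    int status;
    long nread = -1;

    /* Parent: wait until the child has reported a state change. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    if (WIFSTOPPED(status)) {
        struct user_regs_struct regs;
        struct iovec iov = { .iov_base = &regs, .iov_len = sizeof(regs) };

        /* The tracee is stopped, so its saved context is stable here. */
        if (ptrace(PTRACE_GETREGSET, pid,
                   (void *)(long)NT_PRSTATUS, &iov) == 0)
            nread = (long)iov.iov_len;

        /* Done with the child: kill it and reap it. */
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
    }
    /* If the child exited instead of stopping, it is already reaped. */

    return nread;
}
```

The point of the sketch is that the tracer never touches the tracee's
context until waitpid() has reported the stop, which is the ordering
the TASK_TRACED state machine enforces.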

And as a bonus, if this is implemented via a ptrace extension it will
be trivial to add support for these new context types to all sorts of
user-space debuggers as well. With new syscalls it will take ages for
this to trickle through to all parties involved.

Ingo