Re: [PATCH] perf: sample after exit loses thread correlation

From: Adrian Hunter
Date: Mon Jul 29 2013 - 06:29:48 EST


On 27/07/2013 1:04 a.m., David Ahern wrote:
Occassionally events (e.g., context-switch, sched tracepoints) are losing
the conversion of sample data associated with a thread. For example:

$ perf record -e sched:sched_switch -c 1 -a -- sleep 5
$ perf script
<selected events shown>
ls 30482 [000] 1379727.583037: sched:sched_switch: prev_comm=ls prev_pid=30482 ...
ls 30482 [000] 1379727.586339: sched:sched_switch: prev_comm=ls prev_pid=30482 ...
:30482 30482 [000] 1379727.589462: sched:sched_switch: prev_comm=ls prev_pid=30482 ...

The last line lost the conversion from tid to comm. If you look at the events
(perf script -D) you see why - SAMPLE event is generated after the EXIT:

0 1379727589449774 0x1540b0 [0x38]: PERF_RECORD_EXIT(30482:30482):(30482:30482)
0 1379727589462497 0x1540e8 [0x80]: PERF_RECORD_SAMPLE(IP, 1): 30482/30482: 0xffffffff816416f1 period: 1 addr: 0
.... thread: :30482:30482

When perf processes the EXIT event the thread is moved to the dead_threads
list. When the SAMPLE event is processed no thread exists for the pid so a new
one is created by machine__findnew_thread.

In the case of per-cpu recording, it is normal for sample events to occur after the EXIT event simply because the EXIT event is recorded while the task is still running (in kernel). It is therefore a mistake to move the thread to dead_threads just because of the EXIT event.

Instead the thread should be marked as exiting and not moved to dead_threads until another thread starts on the same CPU. i.e. a comm, mmap, fork event with the same tid and same CPU, or any event with a different tid and same CPU.


This patch addresses the problem by saving the exit time and checking the
dead_threads list for the requested tid. If the time passed into
machine_findnew_thread is non-0 the dead_threads list is walked looking for
the tid. If the thread struct associated with the tid exited within 1 msec
of the time passed in the thread is considered a match and returned.

If samples do not contain timestamps then sample->time will be 0 and the
dead_threads list will not be checked. -1 can be used to force always checking
the dead_threads list and returning a match.

With this patch we get the previous example shows:

ls 30482 [000] 1379727.583037: sched:sched_switch: prev_comm=ls prev_pid=30482 ...
ls 30482 [000] 1379727.586339: sched:sched_switch: prev_comm=ls prev_pid=30482 ...
ls 30482 [000] 1379727.589462: sched:sched_switch: prev_comm=ls prev_pid=30482 ...

and

0 1379727589449774 0x1540b0 [0x38]: PERF_RECORD_EXIT(30482:30482):(30482:30482)
0 1379727589462497 0x1540e8 [0x80]: PERF_RECORD_SAMPLE(IP, 1): 30482/30482: 0xffffffff816416f1 period: 1 addr: 0
.... thread: ls:30482

v2: Rebased to latest perf/core branch. Changed time comparison to use
a macro which explicitly shows the time basis

Signed-off-by: David Ahern<dsahern@xxxxxxxxx>
Cc: Frederic Weisbecker<fweisbec@xxxxxxxxx>
Cc: Ingo Molnar<mingo@xxxxxxxxxx>
Cc: Jiri Olsa<jolsa@xxxxxxxxxx>
Cc: Mike Galbraith<efault@xxxxxx>
Cc: Namhyung Kim<namhyung@xxxxxxxxxx>
Cc: Peter Zijlstra<peterz@xxxxxxxxxxxxx>
Cc: Stephane Eranian<eranian@xxxxxxxxxx>
---
tools/perf/builtin-inject.c | 2 +-
tools/perf/builtin-kmem.c | 3 +--
tools/perf/builtin-kvm.c | 2 +-
tools/perf/builtin-lock.c | 2 +-
tools/perf/builtin-sched.c | 16 +++++++--------
tools/perf/builtin-script.c | 2 +-
tools/perf/builtin-trace.c | 6 +++---
tools/perf/perf.h | 3 +++
tools/perf/tests/hists_link.c | 2 +-
tools/perf/util/build-id.c | 9 ++++-----
tools/perf/util/event.c | 10 ++++++++--
tools/perf/util/machine.c | 44 ++++++++++++++++++++++++++++++-----------
tools/perf/util/machine.h | 6 ++++--
tools/perf/util/session.c | 2 +-
tools/perf/util/thread.h | 2 +-
15 files changed, 71 insertions(+), 40 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/