KGTP example 1: count the function vfs_read running clock and calledtimes of each program with GDB+Python

From: Hui Zhu
Date: Mon Jun 06 2011 - 11:34:32 EST


Hi,

In the release note of KGTP
201100601(https://lkml.org/lkml/2011/6/1/543), I introduce the
gtpframe_pipe. It supply to get the tracepoint frame entry when kgtp
is running.
I will show a example to count the function vfs_read running clock and
called times of each program with GDB+Python.

Before using the examples, you need to make some preparations.
1. Make sure your kernel is built by GCC 4.5 or later version.
Because Kernel is built by -O2, new version GCC generate more debug
info.
2. Make sure your kernel support Kprobe because KGTP need it.
3. Use the GDB 7.3 or later version. Because old version GDB have a
lot of bugs of tracepoint and miss some feature that we need in the
example. If you use UBUNTU, You can got later version GDB package in
https://lkml.org/lkml/2011/6/4/65
4. Make sure your GDB support Python. To make sure that, you can use
following command in GDB:
(gdb) python print("python is OK.\n")
If GDB output "python is OK.", GDB support Python is OK.
If not, you need rebuild your GDB with config "--with-python".

See https://code.google.com/p/kgtp/wiki/HOWTO#Get_KGTP_through_http to
get howto get, build and insmod KGTP.

After insmod KGTP module you can set the tracepoint. The attachment
"r.source" is the GDB source file for set the tracepoint in the
address that we need.
You can let your GDB set the tracepoint with command:
sudo gdb ./vmlinux -x ./r.source
Or GDB command:
source r.source
To load it.
After set the tracepoint, you can use GDB command "tstart" and "tstop"
to start and stop the trace.

Then we can begin to parse the KGTP tracepoint frame entry with
GDB+Python. The attachemt "r.py" is the Python script that will do
it. You can use the command:
sudo gdb ./vmlinux -x r.py
Or GDB command:
source r.py
When the parse begin, you can use ctrl-c to get the info like:
vfs_read: 1146144473124 ns 54981 times

Top 3 of clock is:
{<unavailable> <repeats 16 times>} 859234925034
"chrome\000\000\000\000\000\000\000\000\000" 286747105396
"Xorg", '\000' <repeats 11 times> 87940392

Top 3 of times is:
"chrome\000\000\000\000\000\000\000\000\000" 35956
"Xorg", '\000' <repeats 11 times> 7915
"pulseaudio\000\000\000\000\000" 2406

Top 3 of clock per times is:
{<unavailable> <repeats 16 times>} 15343480804
"chrome\000\000\000\000\000\000\000\000\000" 7974944
"hald", '\000' <repeats 11 times> 462464

Top 3 of real clock is:
{<unavailable> <repeats 16 times>} 859234925034
"chrome\000\000\000\000\000\000\000\000\000" 286747105396
"Xorg", '\000' <repeats 11 times> 87940392

Top 3 of real clock per times is:
"kscope\000\000\000\000\000\000\000\000\000" 9085
"GoogleTalkPlugi" 2290
"udisks-helper-a" 2158

The clock is the vfs_read exec ns.
The real clock is the vfs_read exec ns that only when the process that
call vfs_read is running.



I will introduce the r.source and r.py in following part:
Following part I will explain r.source:
1 list vfs_read
2 target remote /sys/kernel/debug/gtp
This line connect the interface of KGTP.
3 trace read_write.c:310
4 commands
5 collect ((struct task_struct *)$current_task)->pid
6 collect $no_self_trace
7 end
These lines set tracepoint in the begin of function vfs_read.
Line 5 mean collect current pid of process.
Line 6 mean this tracepoint set in a code that have process context.
Don't need trace the current KGTP user space code such as GDB connect
to the KGTP interface. It introduce in
https://code.google.com/p/kgtp/wiki/HOWTO#Special_trace_state_variable_$no_self_trace
8 trace read_write.c:331
9 commands
10 collect ((struct task_struct *)$current_task)->pid
11 collect ((struct task_struct *)$current_task)->comm
12 collect $no_self_trace
13 end
These lines set tracepoint in the end of function vfs_read. All the
return of vfs_read will trace by this tracepoint.
This tracepoint collect other value that need in Python script.
14 list schedule
15 trace 4247
16 commands
17 collect prev->pid
18 collect next->pid
19 end
This tracepoint trace in the "schedule" function, we will use it to
get which process is running on the CPU.
20 set circular-trace-buffer on
This command set the frame circular. When the frame is full, it will
auto overwrite the old entry.


Following part I will explain r.py:
1 #!/usr/bin/python
2
3 import signal;
4 import gdb;
gdb is a special lib when use python inside GDB, you can get more
about it in http://sourceware.org/gdb/current/onlinedocs/gdb/Python.html
5
6 real_read_clock = 0;
7 read_clock = 0;
8 read_num = 0;
9 read_comm = {};
These vals is for record the info from Kernel.
10
11 def tfind_entry(num):
12 if num < 0:
13 tid = gdb.execute("tfind", False, True);
14 else:
15 tid = gdb.execute("tfind "+str(num), False, True);
16 if tid.find(", tracepoint ") < 0:
17 return -1;
18 tid = tid[tid.find(", tracepoint ") + len(", tracepoint "):];
19 return int(tid);
This function can get the "num" entry and return the tracepoint
number. If num < 0, will get the next entry.
The tracepoint return value can use to make sure which address we arrive.
If you write a python parse code, you can use this function.
20
21 def print_top(comm, index, num):
22 for e in comm:
23 print e[0], e[1][index];
24 num -= 1;
25 if num <= 0:
26 break;
27
28 def sigint_handler(num, e):
29 print "vfs_read: ", read_clock, "ns", read_num, "times";
30 print "";
31 print "Top 3 of clock is:";
32 print_top (sorted(read_comm.items(), key=lambda d:d[1][0],
reverse=True), 0, 3);
33 print "";
34 print "Top 3 of times is:";
35 print_top (sorted(read_comm.items(), key=lambda d:d[1][1],
reverse=True), 1, 3);
36 print "";
37 print "Top 3 of clock per times is:";
38 print_top (sorted(read_comm.items(), key=lambda d:d[1][2],
reverse=True), 2, 3);
39 print "";
40 print "Top 3 of real clock is:";
41 print_top (sorted(read_comm.items(), key=lambda d:d[1][3],
reverse=True), 3, 3);
42 print "";
43 print "Top 3 of real clock per times is:";
44 print_top (sorted(read_comm.items(), key=lambda d:d[1][4],
reverse=True), 4, 3);
45
46 try:
47 s = raw_input('Continue? (yes)')
48 except:
49 s = 'y'
50 finally:
51 if s[0:1] != 'n' and s[0:1] != 'N':
52 return;
53 exit(1);
This function is the handler of ctrl-c. It output the parse info with
function print_top.
54
55 gdb.execute("set pagination off", True, True);
This is the first line of parser.
The line close pagination then it will not affect the parse. If you
write yourself parser, suggest add this line too.
56
57 #set tracepoint id
58 vfs_read_begin = 1;
59 vfs_read_end = 2;
60 schedule = 3;
61
62 signal.signal(signal.SIGINT, sigint_handler);
63 signal.siginterrupt(signal.SIGINT, False);
Setup the ctrl-c handler and set system call restart when it return.
64
65 #connect to the gdbframe_pipe
66 gdb.execute("list vfs_read");
67 gdb.execute("target tfile /sys/kernel/debug/gtpframe_pipe");
Connect to KGTP interface.
68
69 r_list = {};
70
71 while 1:
72 try:
73 tid = tfind_entry(-1);
74 if tid < 0:
75 raise gdb.error ("Something wrong, drop one entry.");
Get entry from gtpframe_pipe is special. Please see
https://code.google.com/p/kgtp/wiki/HOWTO#Get_the_frame_info_with_GDB
to get info.
76
77 if tid == vfs_read_begin:
78 entry = {};
79 entry["begin_clock"] = long(gdb.parse_and_eval("$clock"));
80 entry["prev_clock"] = entry["begin_clock"];
81 entry["real_clock"] = 0;
82 pid = long(gdb.parse_and_eval("((struct task_struct
*)$current_task)->pid"));
83 r_list[pid] = entry;
This the parser of begin of vfs_read. It will record the entry of a
pid to the r_list.
In GDB python you can access special trace state variables with
function "gdb.parse_and_eval". This function will return a special
type that inside GDB. So need convert it to other type before use it.
84 elif tid == schedule:
85 pid = long(gdb.parse_and_eval("prev->pid"));
86 if pid in r_list:
87 clock = long(gdb.parse_and_eval("$clock"));
88 r_list[pid]["real_clock"] += clock - r_list[pid]["prev_clock"];
89 r_list[pid]["prev_clock"] = clock;
90 else:
91 pid = long(gdb.parse_and_eval("next->pid"));
92 if pid in r_list:
93 r_list[pid]["prev_clock"] = long(gdb.parse_and_eval("$clock"));
This is the parser of schedule.
When the prev pid or next pid is inside r_list, it will set the
real_clock inside r_list.
94 elif tid == vfs_read_end:
95 pid = long(gdb.parse_and_eval("((struct task_struct
*)$current_task)->pid"));
96 if pid not in r_list:
97 raise gdb.error ("cannot find vfs_read_begin for vfs_read_end");
98
99 clock = long(gdb.parse_and_eval("$clock"));
100 r_list[pid]["real_clock"] += clock - r_list[pid]["prev_clock"];
101 clock = clock - r_list[pid]["begin_clock"];
102
103 read_clock += clock;
104 real_read_clock += r_list[pid]["real_clock"];
105 read_num += 1;
106
107 comm = str(gdb.parse_and_eval("((struct task_struct
*)$current_task)->comm"));
108 if comm in read_comm:
109 read_comm[comm][0] += clock;
110 read_comm[comm][1] += 1;
111 read_comm[comm][2] = read_comm[comm][0] / read_comm[comm][1];
112 read_comm[comm][3] += r_list[pid]["real_clock"];
113 read_comm[comm][4] = r_list[pid]["real_clock"] / read_comm[comm][1];
114 else:
115 read_comm[comm] = [clock, 1, clock,
r_list[pid]["real_clock"], r_list[pid]["real_clock"]];
116 del(r_list[pid]);
This is handler of vfs_read end. When get the pid inside r_list,
record it and remove it.
117 else:
118 raise gdb.error ("Trace id error.");
119 except gdb.error, x:
120 print("Drop one entry because", x);
121 except gdb.MemoryError, x:
122 print("Drop one entry because", x);
This is the except handler, we didn't handle all the except because we
want handle ctrl-c.
123
124 gdb.execute("tfind", False, True);
Tell KGTP pipe interface that we need the next entry.

This code have a bug. Because we set circular-trace-buffer on, some
entry will be overwrite. If a vfs_read_end is overwrite but
vfs_read_begin is record in r_list, it will never be removed. r_list
will increase too big.

Thanks,
Hui

Attachment: r.source
Description: Binary data

#!/usr/bin/python

import signal;
import gdb;

real_read_clock = 0;
read_clock = 0;
read_num = 0;
read_comm = {};

def tfind_entry(num):
if num < 0:
tid = gdb.execute("tfind", False, True);
else:
tid = gdb.execute("tfind "+str(num), False, True);
if tid.find(", tracepoint ") < 0:
return -1;
tid = tid[tid.find(", tracepoint ") + len(", tracepoint "):];
return int(tid);

def print_top(comm, index, num):
for e in comm:
print e[0], e[1][index];
num -= 1;
if num <= 0:
break;

def sigint_handler(num, e):
print "vfs_read: ", read_clock, "ns", read_num, "times";
print "";
print "Top 3 of clock is:";
print_top (sorted(read_comm.items(), key=lambda d:d[1][0], reverse=True), 0, 3);
print "";
print "Top 3 of times is:";
print_top (sorted(read_comm.items(), key=lambda d:d[1][1], reverse=True), 1, 3);
print "";
print "Top 3 of clock per times is:";
print_top (sorted(read_comm.items(), key=lambda d:d[1][2], reverse=True), 2, 3);
print "";
print "Top 3 of real clock is:";
print_top (sorted(read_comm.items(), key=lambda d:d[1][3], reverse=True), 3, 3);
print "";
print "Top 3 of real clock per times is:";
print_top (sorted(read_comm.items(), key=lambda d:d[1][4], reverse=True), 4, 3);

try:
s = raw_input('Continue? (yes)')
except:
s = 'y'
finally:
if s[0:1] != 'n' and s[0:1] != 'N':
return;
exit(1);

gdb.execute("set pagination off", True, True);

#set tracepoint id
vfs_read_begin = 1;
vfs_read_end = 2;
schedule = 3;

signal.signal(signal.SIGINT, sigint_handler);
signal.siginterrupt(signal.SIGINT, False);

#connect to the gdbframe_pipe
gdb.execute("list vfs_read");
gdb.execute("target tfile /sys/kernel/debug/gtpframe_pipe");

r_list = {};

while 1:
try:
tid = tfind_entry(-1);
if tid < 0:
raise gdb.error ("Something wrong, drop one entry.");

if tid == vfs_read_begin:
entry = {};
entry["begin_clock"] = long(gdb.parse_and_eval("$clock"));
entry["prev_clock"] = entry["begin_clock"];
entry["real_clock"] = 0;
pid = long(gdb.parse_and_eval("((struct task_struct *)$current_task)->pid"));
r_list[pid] = entry;
elif tid == schedule:
pid = long(gdb.parse_and_eval("prev->pid"));
if pid in r_list:
clock = long(gdb.parse_and_eval("$clock"));
r_list[pid]["real_clock"] += clock - r_list[pid]["prev_clock"];
r_list[pid]["prev_clock"] = clock;
else:
pid = long(gdb.parse_and_eval("next->pid"));
if pid in r_list:
r_list[pid]["prev_clock"] = long(gdb.parse_and_eval("$clock"));
elif tid == vfs_read_end:
pid = long(gdb.parse_and_eval("((struct task_struct *)$current_task)->pid"));
if pid not in r_list:
raise gdb.error ("cannot find vfs_read_begin for vfs_read_end");

clock = long(gdb.parse_and_eval("$clock"));
r_list[pid]["real_clock"] += clock - r_list[pid]["prev_clock"];
clock = clock - r_list[pid]["begin_clock"];

read_clock += clock;
real_read_clock += r_list[pid]["real_clock"];
read_num += 1;

comm = str(gdb.parse_and_eval("((struct task_struct *)$current_task)->comm"));
if comm in read_comm:
read_comm[comm][0] += clock;
read_comm[comm][1] += 1;
read_comm[comm][2] = read_comm[comm][0] / read_comm[comm][1];
read_comm[comm][3] += r_list[pid]["real_clock"];
read_comm[comm][4] = r_list[pid]["real_clock"] / read_comm[comm][1];
else:
read_comm[comm] = [clock, 1, clock, r_list[pid]["real_clock"], r_list[pid]["real_clock"]];
del(r_list[pid]);
else:
raise gdb.error ("Trace id error.");
except gdb.error, x:
print("Drop one entry because", x);
except gdb.MemoryError, x:
print("Drop one entry because", x);

gdb.execute("tfind", False, True);