[PATCH] ftrace: update txt document

From: Steven Rostedt
Date: Thu Oct 23 2008 - 19:23:28 EST



A lot of changes have gone into ftrace. This patch updates
the ftrace.txt document.

Signed-off-by: Steven Rostedt <srostedt@xxxxxxxxxx>
---
Documentation/ftrace.txt | 145 +++++++++++++++++++----------------------------
1 file changed, 59 insertions(+), 86 deletions(-)

Index: linux-tip.git/Documentation/ftrace.txt
===================================================================
--- linux-tip.git.orig/Documentation/ftrace.txt 2008-08-14 13:54:58.000000000 -0400
+++ linux-tip.git/Documentation/ftrace.txt 2008-10-23 19:20:30.000000000 -0400
@@ -94,23 +94,19 @@ of ftrace. Here is a list of some of the
only be recorded if the latency is greater than
the value in this file. (in microseconds)

- trace_entries : This sets or displays the number of trace
- entries each CPU buffer can hold. The tracer buffers
- are the same size for each CPU. The displayed number
- is the size of the CPU buffer and not total size. The
+ trace_entries : This sets or displays the number of bytes each CPU
+ buffer can hold. The tracer buffers are the same size
+ for each CPU. The displayed number is the size of the
+ CPU buffer and not total size of all buffers. The
trace buffers are allocated in pages (blocks of memory
that the kernel uses for allocation, usually 4 KB in size).
- Since each entry is smaller than a page, if the last
- allocated page has room for more entries than were
- requested, the rest of the page is used to allocate
- entries.
+ If the last page allocated has room for more bytes
+ than requested, the rest of the page is used to allocate.
+ (Note, the size may not be a multiple of the page size due
+ to buffer managment overhead).

This can only be updated when the current_tracer
- is set to "none".
-
- NOTE: It is planned on changing the allocated buffers
- from being the number of possible CPUS to
- the number of online CPUS.
+ is set to "nop".

tracing_cpumask : This is a mask that lets the user only trace
on specified CPUS. The format is a hex string
@@ -130,13 +126,10 @@ of ftrace. Here is a list of some of the
be traced. If a function exists in both set_ftrace_filter
and set_ftrace_notrace, the function will _not_ be traced.

- available_filter_functions : When a function is encountered the first
- time by the dynamic tracer, it is recorded and
- later the call is converted into a nop. This file
- lists the functions that have been recorded
- by the dynamic tracer and these functions can
- be used to set the ftrace filter by the above
- "set_ftrace_filter" file. (See the section "dynamic ftrace"
+ available_filter_functions : This lists the functions that ftrace
+ has processed and can trace. These are the function
+ names that you can pass to "set_ftrace_filter" or
+ "set_ftrace_notrace". (See the section "dynamic ftrace"
below for more details).


@@ -145,7 +138,7 @@ The Tracers

Here is the list of current tracers that may be configured.

- ftrace - function tracer that uses mcount to trace all functions.
+ function - function tracer that uses mcount to trace all functions.

sched_switch - traces the context switches between tasks.

@@ -166,8 +159,8 @@ Here is the list of current tracers that
the highest priority task to get scheduled after
it has been woken up.

- none - This is not a tracer. To remove all tracers from tracing
- simply echo "none" into current_tracer.
+ nop - This is not a tracer. To remove all tracers from tracing
+ simply echo "nop" into current_tracer.


Examples of using the tracer
@@ -182,7 +175,7 @@ Output format:
Here is an example of the output format of the file "trace"

--------
-# tracer: ftrace
+# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
@@ -192,7 +185,7 @@ Here is an example of the output format
--------

A header is printed with the tracer name that is represented by the trace.
-In this case the tracer is "ftrace". Then a header showing the format. Task
+In this case the tracer is "function". Then a header showing the format. Task
name "bash", the task PID "4251", the CPU that it was running on
"01", the timestamp in <secs>.<usecs> format, the function name that was
traced "path_put" and the parent function that called this function
@@ -1000,22 +993,20 @@ is the stack for the hard interrupt. Thi
has been set. We do not see the 'N' until we switch back to the task's
assigned stack.

-ftrace
-------
+function
+--------

-ftrace is not only the name of the tracing infrastructure, but it
-is also a name of one of the tracers. The tracer is the function
-tracer. Enabling the function tracer can be done from the
-debug file system. Make sure the ftrace_enabled is set otherwise
-this tracer is a nop.
+This tracer is the function tracer. Enabling the function tracer
+can be done from the debug file system. Make sure the ftrace_enabled is
+set otherwise this tracer is a nop.

# sysctl kernel.ftrace_enabled=1
- # echo ftrace > /debug/tracing/current_tracer
+ # echo function > /debug/tracing/current_tracer
# echo 1 > /debug/tracing/tracing_enabled
# usleep 1
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/trace
-# tracer: ftrace
+# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
@@ -1037,10 +1028,10 @@ this tracer is a nop.
[...]


-Note: ftrace uses ring buffers to store the above entries. The newest data
-may overwrite the oldest data. Sometimes using echo to stop the trace
-is not sufficient because the tracing could have overwritten the data
-that you wanted to record. For this reason, it is sometimes better to
+Note: function tracer uses ring buffers to store the above entries.
+The newest data may overwrite the oldest data. Sometimes using echo to
+stop the trace is not sufficient because the tracing could have overwritten
+the data that you wanted to record. For this reason, it is sometimes better to
disable tracing directly from a program. This allows you to stop the
tracing at the point that you hit the part that you are interested in.
To disable the tracing directly from a C program, something like following
@@ -1074,18 +1065,30 @@ every kernel function, produced by the -
of pointing to a simple return. (Enabling FTRACE will include the
-pg switch in the compiling of the kernel.)

-When dynamic ftrace is initialized, it calls kstop_machine to make
-the machine act like a uniprocessor so that it can freely modify code
-without worrying about other processors executing that same code. At
-initialization, the mcount calls are changed to call a "record_ip"
-function. After this, the first time a kernel function is called,
-it has the calling address saved in a hash table.
-
-Later on the ftraced kernel thread is awoken and will again call
-kstop_machine if new functions have been recorded. The ftraced thread
-will change all calls to mcount to "nop". Just calling mcount
-and having mcount return has shown a 10% overhead. By converting
-it to a nop, there is no measurable overhead to the system.
+At compile time every C file object is run through the
+recordmcount.pl script (located in the scripts directory). This
+script will process the C object using objdump to find all the
+locations in the .text section that call mcount. (Note, only
+the .text section is processed, since processing other sections
+like .init.text may cause races due to those sections being freed).
+
+A new section called "__mcount_loc" is created that holds references
+to all the mcount call sites in the .text section. This section is
+compiled back into the original object. The final linker will add
+all these references into a single table.
+
+On boot up, before SMP is initialized, the dynamic ftrace code
+scans this table and updates all the locations into nops. It also
+records the locations, which are added to the available_filter_functions
+list. Modules are processed as they are loaded and before they are
+executed. They also remove their functions from the list when they
+are removed.
+
+When tracing is enabled, kstop_machine is called to prevent races
+with the CPUS executing code being modified (which can cause the
+CPU to do undesireable things), and the nops are patched back
+to calls. But this time, they do not call mcount (which is just
+a function stub). They now call into the ftrace infrastructure.

One special side-effect to the recording of the functions being
traced is that we can now selectively choose which functions we
@@ -1248,36 +1251,6 @@ Produces:

We can see that there's no more lock or preempt tracing.

-ftraced
--------
-
-As mentioned above, when dynamic ftrace is configured in, a kernel
-thread wakes up once a second and checks to see if there are mcount
-calls that need to be converted into nops. If there are not any, then
-it simply goes back to sleep. But if there are some, it will call
-kstop_machine to convert the calls to nops.
-
-There may be a case in which you do not want this added latency.
-Perhaps you are doing some audio recording and this activity might
-cause skips in the playback. There is an interface to disable
-and enable the "ftraced" kernel thread.
-
- # echo 0 > /debug/tracing/ftraced_enabled
-
-This will disable the calling of kstop_machine to update the
-mcount calls to nops. Remember that there is a large overhead
-to calling mcount. Without this kernel thread, that overhead will
-exist.
-
-If there are recorded calls to mcount, any write to the ftraced_enabled
-file will cause the kstop_machine to run. This means that a
-user can manually perform the updates when they want to by simply
-echoing a '0' into the ftraced_enabled file.
-
-The updates are also done at the beginning of enabling a tracer
-that uses ftrace function recording.
-
-
trace_pipe
----------

@@ -1286,14 +1259,14 @@ on the tracing is different. Every read
This means that subsequent reads will be different. The trace
is live.

- # echo ftrace > /debug/tracing/current_tracer
+ # echo function > /debug/tracing/current_tracer
# cat /debug/tracing/trace_pipe > /tmp/trace.out &
[1] 4153
# echo 1 > /debug/tracing/tracing_enabled
# usleep 1
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/trace
-# tracer: ftrace
+# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
@@ -1314,7 +1287,7 @@ is live.

Note, reading the trace_pipe file will block until more input is added.
By changing the tracer, trace_pipe will issue an EOF. We needed
-to set the ftrace tracer _before_ cating the trace_pipe file.
+to set the function tracer _before_ cating the trace_pipe file.


trace entries
@@ -1331,10 +1304,10 @@ number of entries.
65620

Note, to modify this, you must have tracing completely disabled. To do that,
-echo "none" into the current_tracer. If the current_tracer is not set
-to "none", an EINVAL error will be returned.
+echo "nop" into the current_tracer. If the current_tracer is not set
+to "nop", an EINVAL error will be returned.

- # echo none > /debug/tracing/current_tracer
+ # echo nop > /debug/tracing/current_tracer
# echo 100000 > /debug/tracing/trace_entries
# cat /debug/tracing/trace_entries
100045

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/