Re: Ftrace vs perf user page fault statistics differences

From: Steven Rostedt
Date: Tue Jun 13 2017 - 09:06:57 EST


On Mon, 12 Jun 2017 22:05:05 -0400
Will Hawkins <whh8b@xxxxxxxxxxxx> wrote:

> > Can you supply this program. So I can see exactly what it does?
>
> I have attached the binary to this email. I also provided the source
> so you can reproduce the situation directly. You can use the makefile
> provided as long as you have nasm installed.
>
> Although this is a response to a later question, I won't bury the
> lede: The program was written by hand in asm, compiled with nasm to
> object code and then run through gcc to get the output binary. Here is
> how that works:
>
> nasm -felf64 page.s && gcc -nostdlib page.o -o page
>
> The result, as far as ldd is concerned:
>
> hawkinsw@moormans:~/code/one_page_play$ ldd page
> not a dynamic executable

OK, that makes sense.

>
>
> >
> >>
> >> I looked at the results:
> >>
> >> perf report
> >>
> >> and the results were as I expected. There were two page faults for
> >> loading the code into memory and a page fault to
> >> copy_user_enhanced_fast_string invoked by execve's implementation when
> >> loading the binary.
> >
> > What does perf script show you?
>
> Here is what perf report --stdio shows:
>
> # Overhead Command Shared Object Symbol
> # ........ ....... ................. ..................................
> #
> 33.33% page page [.] _start
> |
> --- _start
>
> 33.33% page page [.] target
> |
> --- target
>
> 33.33% page [kernel.kallsyms] [k] copy_user_enhanced_fast_string
> |
> --- copy_user_enhanced_fast_string
> load_elf_binary
> search_binary_handler
> do_execve_common.isra.23
> sys_execve
> stub_execve
> 0x7f51698a31e7
>
>
>
> Here is what perf script shows:
>
> page 28787 113626.587935: page-faults:
> ffffffff81376609 copy_user_enhanced_fast_string ([kernel.kallsyms])
> ffffffff812197bf load_elf_binary ([kernel.kallsyms])
> ffffffff811c76df search_binary_handler ([kernel.kallsyms])
> ffffffff811c8c8e do_execve_common.isra.23 ([kernel.kallsyms])
> ffffffff811c9176 sys_execve ([kernel.kallsyms])
> ffffffff8173afc9 stub_execve ([kernel.kallsyms])
> 7f51698a31e7 [unknown] ([unknown])
>
> page 28787 113626.587961: page-faults:
> 4000e0 _start (/home/hawkinsw/code/one_page_play/page)
>
> page 28787 113626.587968: page-faults:
> 401473 target (/home/hawkinsw/code/one_page_play/page)
>
> Again, this seems like exactly what I would expect.
>
> >
> >>
> >> I decided to run the application under ftrace just for fun. I wanted
> >> an excuse to learn more about it and this seemed like the perfect
> >> chance. I used the incredible trace-cmd suite for the actual
> >> incantation of ftrace. I won't include the actual incantations here
> >> because I used many of them while digging around.
> >
> > what events did you enable?
>
> Sorry for not including this in my first email. I enabled
> exceptions:page_fault_user. I ran the trace-cmd record like this:
>
> sudo ./trace-cmd record -e exceptions:page_fault_user -T --profile -l
> handle_mm_fault ../one_page_play/page

See if things change if you add -F. Perf by default only traces the
application you specify. trace-cmd by default traces everything.

That is, "trace-cmd record -F" is similar to "perf record". Where as
"trace-cmd record" is similar to "perf record -a".

-- Steve