SystemTap 3.2 release

From: Aaron Merey
Date: Wed Oct 18 2017 - 15:01:12 EST


The SystemTap team announces release 3.2!

Highlights include an early experimental eBPF (extended Berkeley Packet
Filter) backend, a regex engine that now supports extraction of matched
expressions, new probe aliases for accepting input from stdin, improved
translator diagnostics, support for the new statx syscall, and a new
string function, strpos().


= Where to get it

https://sourceware.org/systemtap/ - our project page
https://sourceware.org/systemtap/ftp/releases/systemtap-3.2.tar.gz
https://koji.fedoraproject.org/koji/packageinfo?packageID=615
git tag release-3.2 (commit 4051c70c9318c837)

There have been over 330 commits since the last release.
There have been over 50 bugs fixed / features added since the last release.


= How to build it

See the README and NEWS files at
https://sourceware.org/git/?p=systemtap.git;a=tree

Further information at https://sourceware.org/systemtap/wiki/


= SystemTap frontend (stap) changes

- SystemTap now includes an eBPF backend. This early experimental backend
does not use kernel modules and instead produces eBPF programs that
are verified by the kernel and executed by an in-kernel virtual machine.
Select this backend with the new stap option '--runtime=bpf'. For example:

stap --runtime=bpf -e \
'probe kernel.function("sys_read") { printf("Hi from stapbpf!\n"); exit() }'

Please see the stapbpf(8) man page for more information.

- "stap -k" build trees in $TMPDIR now also include a preprocessed .i form
of the generated module .c code, to aid problem diagnosis.

- The translator produces better diagnostics for the common, annoying case
of missing debuginfo that blocks use of context $variables.


= SystemTap tapset changes

- The regular expression engine now supports extraction of the matched
string and subexpressions using the matched() tapset function:

if ("regexculpicator" =~ "reg(ex.*p).*r") log(matched(1))
-> exculp

- New probe aliases input.char and input.line allow scripts to read
input from stdin at runtime.

- Support for multiple procfs.write probes per procfs file.

- A new function strpos returns the location of a substring within
another string.

- Support for the new statx syscall.

- syscall.execve probes now provide a decoded env_str string vector,
just like the argument vector.

- The task_exe_file() function has been deprecated and replaced by
current_exe_file().

- task_dentry_path() now handles chroot().
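
As a rough illustration of two of the tapset items above, the following
sketch calls strpos() on a sample string and prints the new env_str
variable from a syscall.execve probe. The sample strings are made up for
illustration; execname() and argstr are existing tapset facilities not
mentioned in this announcement. It assumes stap 3.2 or later and the
privileges normally needed to run kernel probes. See the tapset reference
for the exact return convention of strpos(), including the value returned
when the substring is absent.

```systemtap
# Sketch only: assumes SystemTap 3.2+; the sample strings are illustrative.

# strpos(): report where "tap" starts inside a sample string.
probe begin {
  printf("strpos: %d\n", strpos("systemtap", "tap"))
}

# syscall.execve: print the decoded environment vector alongside the
# argument string for the first execve seen, then stop the script.
probe syscall.execve {
  printf("%s: %s env=%s\n", execname(), argstr, env_str)
  exit()
}
```

Saved to a file, it can be run with "stap FILE.stp"; starting any command
in another terminal triggers the execve probe and ends the script.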


= SystemTap sample scripts

All 169 examples can be found at https://sourceware.org/systemtap/examples/

- New Samples:

hugepage_split.stp Log the kernel splitting huge pages into normal pages

hugepage_cow_delays.stp Summarize time doing copy on write for huge pages

hugepage_collapse.stp Log the kernel collapsing normal pages into huge pages

hugepage_clear_delays.stp Summarize time spent clearing huge pages

eventcount.stp Counts specified events. Updated with sort parameters
that can be modified at runtime.

cve-2017-6074.stp historical emergency security band-aid, for
reference/education only


= Examples of tested kernel versions

2.6.18 (RHEL 5 x86 and x86_64)
2.6.32 (RHEL 6 x86 and x86_64)
3.10.0 (RHEL 7 x86_64)
4.11.4 (Fedora 25 x86_64)
4.12.0 (Fedora 26 x86_64)
4.13.5 (Fedora 26 x86_64)
4.14.0-rc4 (Fedora rawhide x86_64)


= Known issues with this release

- The BPF backend is in an early stage of development and lacks
support for a number of features found in the default backend.
See the stapbpf man page for more information.

- Some kernel crashes continue to be reported when a script probes
broad kernel function wildcards. (PR2725)

- Upstream kernel commit 2062afb4f804a put "-fno-var-tracking-assignments"
into KCFLAGS, reducing debuginfo quality, which can cause debuginfo failures.
A proposed workaround for this issue is described at
https://lkml.org/lkml/2014/11/21/505 . Fedora kernels are not affected by
this issue.


= Contributors for this release

Aaron Merey*, Arjun Shankar*, Bernhard M. Wiedemann*, Cody Santing,
Daan Spitz*, David Smith, Frank Ch. Eigler, Guilherme G. Piccoli*,
Jakub Jelinek*, Mark Wielaard, Martin Cermak, Mikael Dubik*, Richard Fontana*,
Richard Henderson*, Ritesh Raj Sarraf*, Ruslan Kuprieiev*, Sandipan Das*,
Saul Wold*, Serhei Makarov, Stan Cox, Tetsuo Handa, Torsten Polle,
Vitaly Mayatskikh*, William Cohen

Special thanks to new contributors, marked with '*' above.
Special thanks to Aaron Merey for drafting these notes.

= Bugs fixed for this release <https://sourceware.org/PR#####>

22278 the nss client code doesn't handle '-I DIR' well
22287 the ioblock.stp tapset needs to be updated for rawhide
20516 "BUG: spinlock recursion on CPU#0" crash on s390x
20734 "sleeping function called from invalid context" bogus
kernel BUG on s390x
22124 RHEL6 ppc64 system crash when running the perf.exp
test case
21887 bpf: exit()
22222 on rawhide, we're getting a "spinlock bad magic" BUG
22155 kernel panic due to NULL vma_cache_p->f_path.dentry
22151 on rhel6 i686, the nettop.stp example script causes a
kernel BUG
22158 on rawhide, we're getting a compile error that
spin_unlock_wait() doesn't exist
15065 regular expressions: subexpression capture support
22117 on 32-bit systems, getting "noncontiguous location for base
fetch" errors
22110 semok/autocast07.stp internal error
22097 the semko/target_addr[23].stp test cases are passing
22109 on rawhide, we're getting errors in the nfsd tapset
22087 implicitptr.exp failing after the bpf merge
22054 on 32-bit systems, buildok/twentynine.stp fails
22066 stap fails to link correctly when not using nss or http
22036 compiler errors when using tapset/linux/ioblock.stp
22031 failure in pthread_stacks.exp
22012 "BUG: scheduling while atomic" when probing syscall.open
22005 @min and @max functions return a wrong value
22008 missing rt_sigreturn return probe hits
21998 receiving lots of "kbuild exited with status: 2" warnings when
using tapset/linux/ip.stp
21996 on 32-bit f26, we're getting dyninst errors when using intptr_t
21984 on 32-bit systems, getting gcc "cast to pointer from integer
of different size" errors
21802 improve the syscall/nd_syscall test case
21917 on rawhide, we're getting a fault in the panic handler
21901 on rawhide, we're getting an "inconsistent lock state"
kernel WARNING
14021 --sysroot=/ not a no-op
21834 on rawhide, the context.exp test case causes a kernel warning
21811 java probe crashes upon null argument
13350 dwarf unwinder_stp_valid_pc_addr() invalid for s390x
21726 on rawhide, the backtrace.exp test case causes a kernel panic
21283 cannot use --dyninst and --remote together
21362 more syscall nesting problems
14923 dyninst: misses events when main thread does not go through
at least one quiesce
16795 utrace_p5.exp leaves stapdyn processes running
15144 occasional (40%) stapdyn spin/hang during sdt.exp
18688 stapdyn hang in testsuite
21463 stap -t --dyninst fail un-cleanly
10712 standardize syscall trace rendering
11206 support stdin script input
21435 convenience groupadd for the make install target
20988 string tapset could use a strstr() type function
20333 merge syscall and nd_syscall tapsets
21363 on rawhide, _struct_sched_attr_u() is failing
21353 syscall nesting in fcntl
21297 support needed for new statx syscall
21238 failed user probing does not return proper error
21255 "make installcheck" Smoke test fails with Fedora
rawhide kernels 4.11.0-0.rc2.git1.1.fc27.x86_64
21190 stap fails to build on rhel6 with --enable-sqlite
where sqlite-3.6 is available