Re: [Linux 5.2.x] /sys/kernel/debug/tracing/events/power/cpu_idle/id: BUG: kernel NULL pointer dereference, address: 0000000000000000

From: Paul Menzel
Date: Sat Aug 10 2019 - 17:12:01 EST


[+ INTEL IDLE DRIVER]

Dear Linux folks,


On 10.08.19 20:28, Paul Menzel wrote:

On 10.08.19 19:31, Thomas Gleixner wrote:

On Sat, 10 Aug 2019, Paul Menzel wrote:

I have no idea who to report this to, so please refer me to the correct
list.

I have no idea yet either :)

With Linux 5.2.7 from Debian Sid/unstable and PowerTOP 2.10, executing

    sudo powertop --auto-tune

causes a NULL pointer dereference, and the graphical session crashes due to an
effect on the i915 driver. It worked in the past with the 4.19 series from
Debian.

Here is the trace, and please find all Linux kernel logs attached.

[ 2027.170589] BUG: kernel NULL pointer dereference, address:
0000000000000000
[ 2027.170600] #PF: supervisor instruction fetch in kernel mode
[ 2027.170604] #PF: error_code(0x0010) - not-present page
[ 2027.170609] PGD 0 P4D 0
[ 2027.170619] Oops: 0010 [#1] SMP PTI
...
[ 2027.170730]  do_dentry_open+0x13a/0x370

If you have compiled with debug info, please decode the line:

  linux/scripts/faddr2line vmlinux do_dentry_open+0x13a/0x370

That gives us the fops pointer which is NULL.

Hah, luckily it's reproducible.

```
$ scripts/faddr2line /usr/lib/debug/boot/vmlinux-5.2.0-2-amd64 do_dentry_open+0x13a/0x370
do_dentry_open+0x13a/0x370:
do_dentry_open at fs/open.c:799
```

[ 2027.170745]  path_openat+0x2c6/0x1480
[ 2027.170757]  ? terminate_walk+0xe6/0x100
[ 2027.170767]  ? path_lookupat.isra.48+0xa3/0x220
[ 2027.170779]  ? reuse_swap_page+0x105/0x320
[ 2027.170791]  do_filp_open+0x93/0x100
[ 2027.170804]  ? __check_object_size+0x15d/0x189
[ 2027.170816]  do_sys_open+0x184/0x220
[ 2027.170828]  do_syscall_64+0x53/0x130
[ 2027.170837]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

That's an open crashing. We just don't know which file. Is the machine
completely hosed after that or is it just the graphics stuff dying?

No, the graphical login manager showed up, and I could log back in and continue using the machine.

If it's not completely dead then instead of running it from your graphical
desktop you could switch to a VGA terminal Alt+Ctrl+F1 (or whatever
function key your distro maps to) after boot and run powertop with strace
from there:

  strace -f -o xxx.log powertop

With a bit of luck xxx.log should contain the information about the file it
tries to open.

```
2157  access("/sys/class/drm/card0/power/rc6_residency_ms", R_OK) = 0
2157  openat(AT_FDCWD, "/sys/kernel/debug/tracing/events/power/cpu_idle/id", O_RDONLY) = ?
2157  +++ killed by SIGKILL +++
```
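
For a longer log, the call that was in flight when the process died can be picked out by looking for the last line with an unknown return value (`= ?`). This is only a sketch: the helper name `last_unfinished_syscall` is made up for illustration and is not part of strace, and it assumes the log was written with `strace -f -o xxx.log` as suggested above.

```shell
# Print the last syscall in an strace log whose return value is unknown
# ("= ?"), i.e. a call that never returned to user space.
# Hypothetical helper for illustration; $1 is the path to the strace log.
last_unfinished_syscall() {
    grep -F ' = ?' "$1" | tail -n 1
}

# Example (assumes xxx.log exists in the current directory):
# last_unfinished_syscall xxx.log
```

Note that a process exiting normally also logs `exit_group(...) = ?`, so the result only means something when the log ends with a `+++ killed by ... +++` line as it does here.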

Alternatively if you have a serial console you can enable the
sys_enter_open* tracepoints:

# echo 1 >/sys/kernel/debug/tracing/events/syscalls/sys_enter_open/enable
# echo 1 >/sys/kernel/debug/tracing/events/syscalls/sys_enter_openat/enable

Either add 'ftrace_dump_on_oops' to the kernel command line or enable it
from the shell:

# echo 1 > /proc/sys/kernel/ftrace_dump_on_oops

Then run powertop. After the crash it will take some time to spill out the trace buffer over serial, but it will pinpoint the offending file.

I do not have serial console on this device.

For the record, it is also reproducible with Linux 5.2.6, and simply trying to print the file contents with cat already fails.

```
$ sudo ls -l /sys/kernel/debug/tracing/events/power/cpu_idle/id
-r--r--r-- 1 root root 0 Aug 10 23:05 /sys/kernel/debug/tracing/events/power/cpu_idle/id
$ sudo cat /sys/kernel/debug/tracing/events/power/cpu_idle/id
Killed
```


Kind regards,

Paul