Re: [BUG 3.12.rc4] Oops: unable to handle kernel paging request during shutdown

From: Linus Torvalds
Date: Fri Oct 25 2013 - 05:02:42 EST


Adding more people, so quoting the whole email for them.

We definitely have some module unload issues. Guys, try the following
a few times to unload modules:

lsmod | grep ' 0 '| cut -d' ' -f1 | xargs sudo rmmod

(a few times because unloading one module will then potentially make
other modules unloadable).
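
A rough sketch of automating the "a few times" part, in case it helps with
reproducing this. It is not from the thread: it swaps the grep/cut filter
for awk on the third lsmod column, and the names unused_modules,
unload_unused and MAX_PASSES are made up for illustration. It assumes
lsmod prints a header line followed by "name size usecount [users]" rows.

```shell
#!/bin/sh
# Print the names of modules whose use count (third lsmod column) is 0.
# Reads lsmod-style text on stdin; the first (header) line is skipped.
unused_modules() {
    awk 'NR > 1 && $3 == 0 { print $1 }'
}

# Repeat rmmod passes until nothing is unloadable or a pass makes no progress.
# MAX_PASSES is a hypothetical safety limit, not something from the thread.
unload_unused() {
    max_passes=${MAX_PASSES:-10}
    pass=0
    while [ "$pass" -lt "$max_passes" ]; do
        mods=$(lsmod | unused_modules)
        [ -n "$mods" ] || break
        before=$(lsmod | wc -l)
        # The rmmod exit status is not checked; progress is judged by
        # whether the number of loaded modules went down.
        echo "$mods" | xargs sudo rmmod
        after=$(lsmod | wc -l)
        [ "$after" -lt "$before" ] || break
        pass=$((pass + 1))
    done
}
```

With both functions defined, calling unload_unused on a test machine runs
the passes, and dmesg afterwards should show any warnings like the one
below. Piping lsmod into unused_modules alone only lists the candidates
and unloads nothing.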

On my machine, I can trigger this, for example:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 3217 at fs/sysfs/file.c:498 sysfs_attr_ns+0x91/0xa0()
sysfs: kobject (null) without dirent
Modules linked in: fuse nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_$
CPU: 0 PID: 3217 Comm: rmmod Not tainted 3.12.0-rc6-00284-ge6036c0b8896 #19
Hardware name: Sony Corporation SVP11213CXB/VAIO, BIOS R0270V7 05/17/2013
0000000000000009 ffff8800aca35df8 ffffffff8160aab5 ffff8800aca35e40
ffff8800aca35e30 ffffffff810514b8 ffffffffa013f080 ffff8801194a6040
0000000000000800 0000000000000000 0000000000c5b3e0 ffff8800aca35e90
Call Trace:
[<ffffffff8160aab5>] dump_stack+0x45/0x56
[<ffffffff810514b8>] warn_slowpath_common+0x78/0xa0
[<ffffffff81051527>] warn_slowpath_fmt+0x47/0x50
[<ffffffff810b5960>] ? module_refcount+0xb0/0xb0
[<ffffffff811e5c61>] sysfs_attr_ns+0x91/0xa0
[<ffffffff811e5d2a>] sysfs_remove_file+0x1a/0x50
[<ffffffff814c88a3>] cpufreq_sysfs_remove_file+0x13/0x30
[<ffffffffa013d350>] acpi_cpufreq_exit+0x2e/0xcde [acpi_cpufreq]
[<ffffffff810b7d1d>] SyS_delete_module+0x15d/0x2c0
[<ffffffff81002929>] ? do_notify_resume+0x59/0x90
[<ffffffff81618f62>] system_call_fastpath+0x16/0x1b
---[ end trace f887112caaa5c4ab ]---

so at least we have a cpufreq/sysfs interaction bug. There may be others.

This particular cpufreq issue may be triggered by the fact that
acpi-cpufreq isn't actually in use (pstate is). Or it might be some
generic cpufreq/sysfs bug. Rafael, Greg, ideas?

I don't see that this particular one would be the one that causes the
timer issues, but it's an example of the fact that module unload tends
to be special and not necessarily well tested.

Linus

On Fri, Oct 25, 2013 at 9:38 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hmm.. I just got a run_timer_softirq oops on my own laptop, slightly
> different. That was not during shutdown, although there was a "yum
> upgrade" finishing when that happened, so it's quite likely that there
> was a service shutdown (and then restart).
>
> I think it's related. But my oops has almost no information: the IP
> that was jumped to was bogus, and the callchain is just CPU idle
> followed by the softirq -> run_timer_softirq handling, so there's no
> real way to see *what* triggered it.
>
> The bad rip was ffffffffa051e250, which is not a valid code address.
> It *might* be a module address, though. So this might be triggered by
> rmmod on some module that doesn't remove all its timers...
>
> Ideas?
>
> Linus