Re: 4.4: INFO: rcu_sched self-detected stall on CPU

From: Boris Ostrovsky
Date: Tue Mar 29 2016 - 10:15:16 EST


On 03/29/2016 04:56 AM, Steven Haigh wrote:

Interestingly enough, this just happened again - but on a different
virtual machine. I'm starting to wonder if this may have something to do
with the uptime of the machine - as the system that this seems to happen
to is always different.

Destroying it and monitoring it again has so far come up blank.

I've thrown the latest lot of kernel messages here:
http://paste.fedoraproject.org/346802/59241532

It would be good to see the full console log. The one you posted starts with an error, so I wonder what came before it.

Have you tried this on bare metal, BTW? And you said this is only observed on 4.4, not 4.5, right?


Interestingly, around the same time, /var/log/messages on the remote
syslog server shows:
Mar 29 17:00:01 zeus systemd: Created slice user-0.slice.
Mar 29 17:00:01 zeus systemd: Starting user-0.slice.
Mar 29 17:00:01 zeus systemd: Started Session 1567 of user root.
Mar 29 17:00:01 zeus systemd: Starting Session 1567 of user root.
Mar 29 17:00:01 zeus systemd: Removed slice user-0.slice.
Mar 29 17:00:01 zeus systemd: Stopping user-0.slice.
Mar 29 17:01:01 zeus systemd: Created slice user-0.slice.
Mar 29 17:01:01 zeus systemd: Starting user-0.slice.
Mar 29 17:01:01 zeus systemd: Started Session 1568 of user root.
Mar 29 17:01:01 zeus systemd: Starting Session 1568 of user root.
Mar 29 17:08:34 zeus ntpdate[18569]: adjust time server 203.56.246.94 offset -0.002247 sec
Mar 29 17:08:34 zeus systemd: Removed slice user-0.slice.
Mar 29 17:08:34 zeus systemd: Stopping user-0.slice.
Mar 29 17:10:01 zeus systemd: Created slice user-0.slice.
Mar 29 17:10:01 zeus systemd: Starting user-0.slice.
Mar 29 17:10:01 zeus systemd: Started Session 1569 of user root.
Mar 29 17:10:01 zeus systemd: Starting Session 1569 of user root.
Mar 29 17:10:01 zeus systemd: Removed slice user-0.slice.
Mar 29 17:10:01 zeus systemd: Stopping user-0.slice.
Mar 29 17:20:01 zeus systemd: Created slice user-0.slice.
Mar 29 17:20:01 zeus systemd: Starting user-0.slice.
Mar 29 17:20:01 zeus systemd: Started Session 1570 of user root.
Mar 29 17:20:01 zeus systemd: Starting Session 1570 of user root.
Mar 29 17:20:01 zeus systemd: Removed slice user-0.slice.
Mar 29 17:20:01 zeus systemd: Stopping user-0.slice.
Mar 29 17:30:55 zeus systemd: systemd-logind.service watchdog timeout (limit 1min)!
Mar 29 17:32:25 zeus systemd: systemd-logind.service stop-sigabrt timed out. Terminating.
Mar 29 17:33:56 zeus systemd: systemd-logind.service stop-sigterm timed out. Killing.
Mar 29 17:35:26 zeus systemd: systemd-logind.service still around after SIGKILL. Ignoring.
Mar 29 17:36:56 zeus systemd: systemd-logind.service stop-final-sigterm timed out. Killing.
Mar 29 17:38:26 zeus systemd: systemd-logind.service still around after final SIGKILL. Entering failed mode.
Mar 29 17:38:26 zeus systemd: Unit systemd-logind.service entered failed state.
Mar 29 17:38:26 zeus systemd: systemd-logind.service failed.


These may be a result of your system not feeling well, which is not surprising.

-boris