Re: 2.6.39-rc4+: oom-killer busy killing tasks

From: Christian Kujau
Date: Mon May 02 2011 - 05:26:26 EST


On Sun, 1 May 2011 at 18:01, Dave Chinner wrote:
> I really don't know why the xfs inode cache is not being trimmed. I
> really, really need to know if the XFS inode cache shrinker is
> getting blocked or not running - do you have those sysrq-w traces
> when near OOM I asked for a while back?

Here's another attempt at getting those:

http://nerdbynature.de/bits/2.6.39-rc4/oom/
* messages-11.txt.gz & slabinfo-11.txt.bz2
- oom-killer at 00:05:04
- last sysrq-w to succeed at 00:05:03

* messages-12.txt.gz & slabinfo-12.txt.bz2, along
with meminfo-post-oom-12.txt & sysrq-w_post-oom-12.jpg could
be more interesting:
- last sysrq-w to succeed at 01:27:08
- oom-killer at 01:27:11

...but after the OOM-killer had killed quite a few processes, MemFree
showed 511236 kB free memory, yet ssh logins were still being killed.
Finally I got a root shell on the box, issued sysrq-w again and even
executed /bin/sync, which came back. But looking at the logs now,
nothing went to the disk (/var/log resides on /, which is an ext4 fs).
See sysrq-w_post-oom-12.jpg for a sysrq-w I took 2381s after boot time,
or 01:32 - syslog stopped at 01:27.
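
For anyone trying to reproduce this, the kind of capture described above
(meminfo and slabinfo copies plus a sysrq-w dump) could be sketched roughly
as below. This is only an illustration and not the script used for the
files linked above: the names parse_meminfo, snapshot and out_dir are
made up for this example, it assumes root (needed to read /proc/slabinfo
and to write /proc/sysrq-trigger), and it assumes the output directory
stays writable, which was evidently not the case here once syslog stopped.

```python
import time
from pathlib import Path


def parse_meminfo(text):
    """Return {field: value} from /proc/meminfo-style text (values mostly in kB)."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def snapshot(out_dir, proc=Path("/proc"), trigger_sysrq=True):
    """Copy meminfo/slabinfo into out_dir and optionally request a sysrq-w dump.

    Hypothetical helper: file naming and layout are assumptions for this sketch.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%H%M%S")
    meminfo_text = ""
    for name in ("meminfo", "slabinfo"):
        src = Path(proc) / name
        if not src.exists():
            continue
        text = src.read_text()
        (out_dir / f"{name}-{stamp}.txt").write_text(text)
        if name == "meminfo":
            meminfo_text = text
    if trigger_sysrq:
        # Writing "w" asks the kernel to dump blocked (uninterruptible) tasks
        # to the kernel log; the dump itself ends up in dmesg/syslog.
        (Path(proc) / "sysrq-trigger").write_text("w")
    return parse_meminfo(meminfo_text)
```

Calling snapshot("/root/oom-debug") every few seconds from a loop, and
looking at the returned "MemFree" value, would give a series comparable
to the files above. Logging to a remote host would still be needed for
the case where nothing reaches the local disk.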

I shall try again with netconsole logging or something...

HTH & thanks for looking into this,
Christian.
--
BOFH excuse #176:

vapors from evaporating sticky-note adhesives