Re: 2.6.39-rc4+: oom-killer busy killing tasks

From: Dave Chinner
Date: Wed Apr 27 2011 - 06:28:34 EST


On Wed, Apr 27, 2011 at 12:46:51AM -0700, Christian Kujau wrote:
> On Wed, 27 Apr 2011 at 12:26, Dave Chinner wrote:
> > What this shows is that VFS inode cache memory usage increases until
> > about the 550 sample mark before the VM starts to reclaim it with
> > extreme prejudice. At that point, I'd expect the XFS inode cache to
> > then shrink, and it doesn't. I've got no idea why either the
>
> Do you remember any XFS changes past 2.6.38 that could be related to
> something like this?

There are plenty of changes that could be the cause - we've changed
the inode reclaim to run in the background out of a workqueue as
well as via the shrinker, so it could even be workqueue starvation
causing the problem...

Hmmmm. Speaking of which - have you changed any of the XFS tunables
in /proc/sys/fs/xfs/ on your machine (specifically
xfssyncd_centisecs)?
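
Something like the following would dump them all in one go so they can
be compared against an untouched machine. It's only a rough sketch:
read_tunables() is a made-up helper name, and all it assumes is that
each tunable is a plain readable file under /proc/sys/fs/xfs/.

```python
from pathlib import Path


def read_tunables(directory="/proc/sys/fs/xfs"):
    """Return {name: value} for every readable file in `directory`.

    `read_tunables` is a hypothetical helper, not an existing tool.
    A missing directory (e.g. XFS not loaded) yields an empty dict.
    """
    values = {}
    base = Path(directory)
    if not base.is_dir():
        return values
    for entry in sorted(base.iterdir()):
        if not entry.is_file():
            continue
        try:
            # Each sysctl file holds a single value followed by a newline.
            values[entry.name] = entry.read_text().strip()
        except OSError:
            # Skip entries that cannot be read by the current user.
            continue
    return values


if __name__ == "__main__":
    for name, value in read_tunables().items():
        print(f"{name} = {value}")
```

Run it as-is and paste the output; xfssyncd_centisecs is the line of
interest.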

> Bisecting is pretty slow on this machine. Could I somehow try to run
> 2.6.39-rc4 but w/o the XFS changes merged after 2.6.38? (Does someone know
> how to do this via git?)

Not easy because there are tree-wide changes that need to be
preserved (e.g. block layer plugging changes) while others around it
would need to be reverted....

> > Can you check if there are any blocked tasks nearing OOM (i.e. "echo
> > w > /proc/sysrq-trigger") so we can see if XFS inode reclaim is
> > stuck somewhere?
>
> Will do, tomorrow.
>
> Should I open a regression bug, so we don't lose track of this thing?

Whatever you want.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx