Re: 2.6.39-rc4+: Kernel leaking memory during FS scanning, regression?

From: Paul E. McKenney
Date: Mon Apr 25 2011 - 15:16:21 EST


On Mon, Apr 25, 2011 at 08:36:06PM +0200, Bruno Prémont wrote:
> On Mon, 25 April 2011 Linus Torvalds wrote:
> > On Mon, Apr 25, 2011 at 10:00 AM, Bruno Prémont wrote:
> > >
> > > I hope tiny-rcu is not that broken... as it would mean driving any
> > > PREEMPT_NONE or PREEMPT_VOLUNTARY system out of memory when compiling
> > > packages (and probably also just unpacking larger tarballs or running
> > > things like du).
> >
> > I'm sure that TINYRCU can be fixed if it really is the problem.
> >
> > So I just want to make sure that we know what the root cause of your
> > problem is. It's quite possible that it _is_ a real leak of filp or
> > something, but before possibly wasting time trying to figure that out,
> > let's see if your config is to blame.
>
> With changed config (PREEMPT=y, TREE_PREEMPT_RCU=y) I haven't reproduced
> yet.
>
> When I was reproducing with TINYRCU things went normally for some time
> until suddenly slabs stopped being freed.

Hmmm... If the system is responsive during this time, could you please
do the following after the slabs stop being freed?

ps -eo pid,class,sched,rtprio,stat,state,sgi_p,cpu_time,cmd | grep '\[rcu'

Thanx, Paul

> > > And with the system doing nothing (except monitoring itself) memory usage
> > > keeps increasing until it starves (well, it seems to keep
> > > ~20M free, pushing the processes it can to swap). The config is just
> > > make oldconfig from the working 2.6.38 kernel (answering default for new
> > > options).
> >
> > How sure are you that the system really is idle? Quite frankly, the
> > constant growing doesn't really look idle to me.
>
> Except the SIGSTOPed build there is not much left, collectd running in
> background (it polls /proc for process counts, fork rate, memory usage,
> ... opening, reading, closing the files -- scanning every 10 seconds),
> slabtop on one terminal.
>
> CPU activity was near-zero with 10%-20% spikes of system use every 10
> minutes and io-wait when all cache had been pushed out.
>
> > > Attached graph matching numbers of previous mail. (dropping caches was at
> > > 17:55, system idle since then)
> >
> > Nothing at all going on in 'ps' during that time? And what does
> > slabinfo say at that point now that kmemleak isn't dominating
> > everything else?
>
> ps definitely does not show anything special, 30 or so userspace processes.
> Didn't check ls /proc/*/fd though. Will do at next occurrence.
>
>
> Going to test further with various PREEMPT and RCU selections. Will report
> back as I progress (but won't have much time tomorrow).
>
> Bruno