Re: 2.6.39-rc4+: Kernel leaking memory during FS scanning, regression?

From: Bruno Prémont
Date: Mon Apr 25 2011 - 07:42:04 EST


On Mon, 25 April 2011 Bruno Prémont wrote:
> On Mon, 25 April 2011 Pekka Enberg wrote:
> > On Mon, Apr 25, 2011 at 12:17 PM, Bruno Prémont wrote:
> > > On Mon, 25 April 2011 Mike Frysinger wrote:
> > >> On Sun, Apr 24, 2011 at 22:42, KOSAKI Motohiro wrote:
> > >> >> On Sun, 24 April 2011 Bruno Prémont wrote:
> > >> >> > On an older system I've been running Gentoo's revdep-rebuild to check
> > >> >> > for system linking/*.la consistency, and after doing most of the work the
> > >> >> > system more or less starved, just complaining about stuck tasks now and
> > >> >> > then.
> > >> >> > The memory-usage graph as seen from userspace showed a sudden, quick increase
> > >> >> > in memory usage, though only a very few MB were swapped out (cf. the attached
> > >> >> > RRD graph).
> > >> >>
> > >> >> Seems I've hit it once again (though I detected it before the system was fully
> > >> >> stalled by trying, without success, to reclaim memory).
> > >> >>
> > >> >> This time it was during simple compiling...
> > >> >> Gathered info below:
> > >> >>
> > >> >> /proc/meminfo:
> > >> >> MemTotal:         480660 kB
> > >> >> MemFree:           64948 kB
> > >> >> Buffers:           10304 kB
> > >> >> Cached:             6924 kB
> > >> >> SwapCached:         4220 kB
> > >> >> Active:            11100 kB
> > >> >> Inactive:          15732 kB
> > >> >> Active(anon):       4732 kB
> > >> >> Inactive(anon):     4876 kB
> > >> >> Active(file):       6368 kB
> > >> >> Inactive(file):    10856 kB
> > >> >> Unevictable:          32 kB
> > >> >> Mlocked:              32 kB
> > >> >> SwapTotal:        524284 kB
> > >> >> SwapFree:         456432 kB
> > >> >> Dirty:                80 kB
> > >> >> Writeback:             0 kB
> > >> >> AnonPages:          6268 kB
> > >> >> Mapped:             2604 kB
> > >> >> Shmem:                 4 kB
> > >> >> Slab:             250632 kB
> > >> >> SReclaimable:      51144 kB
> > >> >> SUnreclaim:       199488 kB  <--- looks big as well...
> > >> >> KernelStack:      131032 kB  <--- what???
> > >> >
> > >> > KernelStack uses 8K bytes per thread, so your system should have about
> > >> > 16000 threads. But your ps only showed about 80 processes.
> > >> > Hmm... stack leak?
> > >>
> > >> i might have a similar report for 2.6.39-rc4 (seems to be working fine
> > >> in 2.6.38.4), but for embedded Blackfin systems running gdbserver
> > >> processes over and over (so lots of short-lived forks)
> > >>
> > >> i wonder if you have a lot of zombies or otherwise unclaimed
> > >> resources? does `ps aux` show anything unusual?
> > >
> > > I've not seen anything special (no big number of threads behind my roughly 80
> > > processes, and even after the kernel oom-killed nearly all processes the hogged
> > > memory was not freed. And no, there are no zombies around).
> > >
> > > Here it seems to happen when I run 2 intensive tasks in parallel, e.g.
> > > (re)emerging gimp and running revdep-rebuild -pi in another terminal.
> > > This produces a fork rate of about 100-300 per second.
> > >
> > > Suddenly kmalloc-128 slabs stop being freed and things degrade.
> > >
> > > Trying to trace some of the kmalloc-128 slab allocations, I end up seeing
> > > lots of allocations like this:
> > >
> > > [ 1338.554429] TRACE kmalloc-128 alloc 0xc294ff00 inuse=30 fp=0xc294ff00
> > > [ 1338.554434] Pid: 1573, comm: collectd Tainted: G        W  2.6.39-rc4-jupiter-00187-g686c4cb #1
> > > [ 1338.554437] Call Trace:
> > > [ 1338.554442]  [<c10aef47>] trace+0x57/0xa0
> > > [ 1338.554447]  [<c10b07b3>] alloc_debug_processing+0xf3/0x140
> > > [ 1338.554452]  [<c10b0972>] T.999+0x172/0x1a0
> > > [ 1338.554455]  [<c10b95d8>] ? get_empty_filp+0x58/0xc0
> > > [ 1338.554459]  [<c10b95d8>] ? get_empty_filp+0x58/0xc0
> > > [ 1338.554464]  [<c10b0a52>] kmem_cache_alloc+0xb2/0x100
> > > [ 1338.554468]  [<c10c08b5>] ? path_put+0x15/0x20
> > > [ 1338.554472]  [<c10b95d8>] ? get_empty_filp+0x58/0xc0
> > > [ 1338.554476]  [<c10b95d8>] get_empty_filp+0x58/0xc0
> > > [ 1338.554481]  [<c10c323f>] path_openat+0x1f/0x320
> > > [ 1338.554485]  [<c10a0a4e>] ? __access_remote_vm+0x19e/0x1d0
> > > [ 1338.554490]  [<c10c3620>] do_filp_open+0x30/0x80
> > > [ 1338.554495]  [<c10b0a30>] ? kmem_cache_alloc+0x90/0x100
> > > [ 1338.554500]  [<c10c16f8>] ? getname_flags+0x28/0xe0
> > > [ 1338.554505]  [<c10cd522>] ? alloc_fd+0x62/0xe0
> > > [ 1338.554509]  [<c10c1731>] ? getname_flags+0x61/0xe0
> > > [ 1338.554514]  [<c10b781d>] do_sys_open+0xed/0x1e0
> > > [ 1338.554519]  [<c10b7979>] sys_open+0x29/0x40
> > > [ 1338.554524]  [<c1391390>] sysenter_do_call+0x12/0x26
> > > [ 1338.556764] TRACE kmalloc-128 alloc 0xc294ff80 inuse=31 fp=0xc294ff80
> > > [ 1338.556774] Pid: 1332, comm: bash Tainted: G        W  2.6.39-rc4-jupiter-00187-g686c4cb #1
> > > [ 1338.556779] Call Trace:
> > > [ 1338.556794]  [<c10aef47>] trace+0x57/0xa0
> > > [ 1338.556802]  [<c10b07b3>] alloc_debug_processing+0xf3/0x140
> > > [ 1338.556807]  [<c10b0972>] T.999+0x172/0x1a0
> > > [ 1338.556812]  [<c10b95d8>] ? get_empty_filp+0x58/0xc0
> > > [ 1338.556817]  [<c10b95d8>] ? get_empty_filp+0x58/0xc0
> > > [ 1338.556821]  [<c10b0a52>] kmem_cache_alloc+0xb2/0x100
> > > [ 1338.556826]  [<c10b95d8>] ? get_empty_filp+0x58/0xc0
> > > [ 1338.556830]  [<c10b95d8>] get_empty_filp+0x58/0xc0
> > > [ 1338.556841]  [<c121fca8>] ? tty_ldisc_deref+0x8/0x10
> > > [ 1338.556849]  [<c10c323f>] path_openat+0x1f/0x320
> > > [ 1338.556857]  [<c11e2b3e>] ? fbcon_cursor+0xfe/0x180
> > > [ 1338.556863]  [<c10c3620>] do_filp_open+0x30/0x80
> > > [ 1338.556868]  [<c10b0a30>] ? kmem_cache_alloc+0x90/0x100
> > > [ 1338.556873]  [<c10c5e8e>] ? do_vfs_ioctl+0x7e/0x580
> > > [ 1338.556878]  [<c10c16f8>] ? getname_flags+0x28/0xe0
> > > [ 1338.556886]  [<c10cd522>] ? alloc_fd+0x62/0xe0
> > > [ 1338.556891]  [<c10c1731>] ? getname_flags+0x61/0xe0
> > > [ 1338.556898]  [<c10b781d>] do_sys_open+0xed/0x1e0
> > > [ 1338.556903]  [<c10b7979>] sys_open+0x29/0x40
> > > [ 1338.556913]  [<c1391390>] sysenter_do_call+0x12/0x26
> > >
> > > Collectd is a system monitoring daemon that counts processes, memory
> > > usage and much more, reading lots of files under /proc every 10
> > > seconds.
> > > Maybe it opens a process-related file at a racy moment and thus
> > > prevents the kmalloc-128 slabs and kernel stacks from being released?
> > >
> > > Replaying the scenario I'm at:
> > > Slab:              43112 kB
> > > SReclaimable:      25396 kB
> > > SUnreclaim:        17716 kB
> > > KernelStack:       16432 kB
> > > PageTables:         1320 kB
> > >
> > > with
> > > kmalloc-256           55     64    256   16    1 : tunables    0    0    0 : slabdata      4      4      0
> > > kmalloc-128        66656  66656    128   32    1 : tunables    0    0    0 : slabdata   2083   2083      0
> > > kmalloc-64          3902   3904     64   64    1 : tunables    0    0    0 : slabdata     61     61      0
> > >
> > > (and the compiling process trees are now SIGSTOPped so the system does
> > > not starve immediately and I can look around for information)
> > >
> > > If I resume one of the compiling process trees, both KernelStack and
> > > slab (kmalloc-128) usage increase quite quickly (and never seem to
> > > go down anymore) - probably at the same rate as processes get born (no
> > > matter when they end).
> >
> > Looks like it might be a leak in VFS. You could try kmemleak to narrow
> > it down some more. See Documentation/kmemleak.txt for details.
>
> Hm, the system seems not willing to let me run kmemleak... each time I put
> on my load scenario I get "BUG: unable to handle kernel " on the console
> as a last breath from the system (the rest of the trace never shows up).
>
> Going to try harder to get at least a complete trace...

After many attempts I got something from kmemleak (running on VESAfb
instead of vgacon or nouveau KMS, with netconsole disabled).
For the crashes my screen is just too small to display the interesting
part of the trace (maybe I can get it via serial console in a later attempt).

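(For reference, the reports below come out of the kmemleak debugfs file as
described in Documentation/kmemleak.txt. The interface itself is documented;
the small helper below is just an illustration, doing the equivalent of
"echo scan > /sys/kernel/debug/kmemleak" followed by a cat of the same file,
assuming CONFIG_DEBUG_KMEMLEAK=y and debugfs mounted at /sys/kernel/debug.)

    /* Sketch: trigger a kmemleak scan and dump the unreferenced-object
     * reports.  Equivalent to the echo/cat sequence from
     * Documentation/kmemleak.txt, nothing more. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/sys/kernel/debug/kmemleak", "r+");
        char line[256];

        if (!f) {
            perror("/sys/kernel/debug/kmemleak");
            return 1;
        }
        fputs("scan\n", f);      /* request an immediate memory scan */
        fflush(f);
        rewind(f);               /* reposition before switching to reading */
        while (fgets(line, sizeof(line), f))
            fputs(line, stdout); /* print the leak reports */
        fclose(f);
        return 0;
    }
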
What kmemleak finds does look very repetitive:
unreferenced object 0xdd294540 (size 328):
comm "collectd", pid 1541, jiffies 4294940278 (age 699.510s)
hex dump (first 32 bytes):
40 57 d2 dc 00 00 00 00 00 00 00 00 00 00 00 00 @W..............
00 00 00 00 00 00 00 00 6d 41 00 00 00 00 00 00 ........mA......
backtrace:
[<c138aae7>] kmemleak_alloc+0x27/0x50
[<c10b0b28>] kmem_cache_alloc+0x88/0x120
[<c10f452e>] proc_alloc_inode+0x1e/0x90
[<c10cd0ec>] alloc_inode+0x1c/0x80
[<c10cd162>] new_inode+0x12/0x40
[<c10f54bc>] proc_pid_make_inode+0xc/0xa0
[<c10f6835>] proc_pident_instantiate+0x15/0xa0
[<c10f693d>] proc_pident_lookup+0x7d/0xb0
[<c10f69a7>] proc_tgid_base_lookup+0x17/0x20
[<c10c1f52>] d_alloc_and_lookup+0x32/0x60
[<c10c23b4>] do_lookup+0xa4/0x250
[<c10c3653>] do_last+0xe3/0x700
[<c10c4882>] path_openat+0x92/0x320
[<c10c4bf0>] do_filp_open+0x30/0x80
[<c10b8ded>] do_sys_open+0xed/0x1e0
[<c10b8f49>] sys_open+0x29/0x40
unreferenced object 0xdd0fa180 (size 128):
comm "collectd", pid 1541, jiffies 4294940278 (age 699.510s)
hex dump (first 32 bytes):
1c c0 00 00 04 00 00 00 00 00 00 00 00 02 20 00 .............. .
00 5e 24 dd 65 f6 12 00 03 00 00 00 a4 a1 0f dd .^$.e...........
backtrace:
[<c138aae7>] kmemleak_alloc+0x27/0x50
[<c10b0b28>] kmem_cache_alloc+0x88/0x120
[<c10cb95e>] d_alloc+0x1e/0x180
[<c10f5027>] proc_fill_cache+0xd7/0x140
[<c10f7b27>] proc_task_readdir+0x237/0x300
[<c10c7cf4>] vfs_readdir+0x84/0xa0
[<c10c7d74>] sys_getdents64+0x64/0xb0
[<c13945d0>] sysenter_do_call+0x12/0x26
[<ffffffff>] 0xffffffff
unreferenced object 0xdd294690 (size 328):
comm "collectd", pid 1541, jiffies 4294940278 (age 699.510s)
hex dump (first 32 bytes):
40 57 d2 dc 00 00 00 00 00 00 00 00 00 00 00 00 @W..............
00 00 00 00 00 00 00 00 6d 41 00 00 00 00 00 00 ........mA......
backtrace:
[<c138aae7>] kmemleak_alloc+0x27/0x50
[<c10b0b28>] kmem_cache_alloc+0x88/0x120
[<c10f452e>] proc_alloc_inode+0x1e/0x90
[<c10cd0ec>] alloc_inode+0x1c/0x80
[<c10cd162>] new_inode+0x12/0x40
[<c10f54bc>] proc_pid_make_inode+0xc/0xa0
[<c10f6791>] proc_task_instantiate+0x11/0xa0
[<c10f506d>] proc_fill_cache+0x11d/0x140
[<c10f7b27>] proc_task_readdir+0x237/0x300
[<c10c7cf4>] vfs_readdir+0x84/0xa0
[<c10c7d74>] sys_getdents64+0x64/0xb0
[<c13945d0>] sysenter_do_call+0x12/0x26
[<ffffffff>] 0xffffffff
unreferenced object 0xdd22df80 (size 128):
comm "collectd", pid 1541, jiffies 4294940278 (age 699.510s)
hex dump (first 32 bytes):
1c c0 00 00 04 00 00 00 00 00 00 00 00 02 20 00 .............. .
80 2c 13 dd 23 c5 6f d6 06 00 00 00 a4 df 22 dd .,..#.o.......".
backtrace:
[<c138aae7>] kmemleak_alloc+0x27/0x50
[<c10b0b28>] kmem_cache_alloc+0x88/0x120
[<c10cb95e>] d_alloc+0x1e/0x180
[<c10c1f40>] d_alloc_and_lookup+0x20/0x60
[<c10c23b4>] do_lookup+0xa4/0x250
[<c10c3653>] do_last+0xe3/0x700
[<c10c4882>] path_openat+0x92/0x320
[<c10c4bf0>] do_filp_open+0x30/0x80
[<c10b8ded>] do_sys_open+0xed/0x1e0
[<c10b8f49>] sys_open+0x29/0x40
[<c13945d0>] sysenter_do_call+0x12/0x26
[<ffffffff>] 0xffffffff

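The backtraces match the collectd access pattern described above: enumerating
/proc/<pid>/task/ for every pid while lots of short-lived processes come and
go. A rough userspace sketch of that pattern (purely illustrative, not the
actual collectd code) which should walk the same
proc_task_readdir()/proc_pid_make_inode() paths next to a fork-heavy workload:

    /* Hypothetical reproducer sketch: one process fork()s short-lived
     * children at a high rate while the other keeps listing
     * /proc/<pid>/task/, like a monitoring daemon would. */
    #include <dirent.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static void scan_proc_tasks(void)
    {
        DIR *proc, *task;
        struct dirent *de;
        char path[64];

        proc = opendir("/proc");
        if (!proc)
            return;
        while ((de = readdir(proc)) != NULL) {
            if (de->d_name[0] < '0' || de->d_name[0] > '9')
                continue;               /* only /proc/<pid> directories */
            snprintf(path, sizeof(path), "/proc/%s/task", de->d_name);
            task = opendir(path);
            if (!task)
                continue;               /* process may already be gone */
            while (readdir(task) != NULL)
                ;                       /* just enumerate the tid entries */
            closedir(task);
        }
        closedir(proc);
    }

    int main(void)
    {
        if (fork() == 0) {
            /* child: stream of short-lived forks, roughly the
             * 100-300 forks/s seen with emerge + revdep-rebuild */
            for (;;) {
                pid_t pid = fork();
                if (pid == 0)
                    _exit(0);
                if (pid > 0)
                    waitpid(pid, NULL, 0);
                usleep(5000);
            }
        }
        /* parent: scan /proc/<pid>/task continuously */
        for (;;) {
            scan_proc_tasks();
            usleep(100000);
        }
    }

If the leak really sits in these proc lookup paths, KernelStack and
kmalloc-128 should grow at roughly the fork rate while such a loop runs.
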
All I could fetch from the system (300k, expanding to ~16MB,
covering some portion of the announced ~6k entries):
http://homepage.internet.lu/BrunoP/jupiter.kmemleak.bz2

Bruno