Re: kmemleak not tainted

From: Sergey Senozhatsky
Date: Tue Jul 07 2009 - 09:15:44 EST


> On Tue, 2009-07-07 at 14:51 +0300, Sergey Senozhatsky wrote:
> > kernel: [ 1917.133154] INFO: RCU detected CPU 0 stall (t=485140/3000 jiffies)
>
> That's the relevant message. With CONFIG_RCU_CPU_STALL_DETECTOR you may
> get these messages.
>
> > static struct kmemleak_object *find_and_get_object(unsigned long ptr, int alias)
> > {
> > unsigned long flags;
> > struct kmemleak_object *object = NULL;
> >
> > rcu_read_lock();
> > read_lock_irqsave(&kmemleak_lock, flags);
> > if (ptr >= min_addr && ptr < max_addr)
> > object = lookup_object(ptr, alias);
> > >> read_unlock_irqrestore(&kmemleak_lock, flags);
> >
> > /* check whether the object is still available */
> > if (object && !get_object(object))
> > object = NULL;
> > rcu_read_unlock();
> >
> > return object;
> > }
>
> It just happened here because that's where the interrupts were enabled
> and the timer routine invoked. The rcu-locked region above should be
> pretty short (just a tree look-up).
>
> What I think happens is that the kmemleak thread runs for several
> seconds for scanning the memory and there may not be any context
> switches. I have a patch to add more cond_resched() calls throughout the
> kmemleak_scan() function which I hope will get merged.

Cc me please.
Thanks.


> I don't get any of these messages with CONFIG_PREEMPT enabled.
>

It started with rc2-git1 (may be). Almost every scan ends with RCU pending.

[ 1917.133154] INFO: RCU detected CPU 0 stall (t=485140/3000 jiffies)
[ 1917.133154] Pid: 511, comm: kmemleak Not tainted 2.6.31-rc2-nv-git1-00254-ga4e2f0a-dirty #2
[ 1917.133154] Call Trace:
[ 1917.133154] [<c141784d>] ? printk+0x23/0x36
[ 1917.133154] [<c10a81e0>] __rcu_pending+0x140/0x210
[ 1917.133154] [<c10a82da>] rcu_pending+0x2a/0x70
[ 1917.133154] [<c1051b8f>] update_process_times+0x3f/0x80
[ 1917.133154] [<c107148f>] tick_sched_timer+0x6f/0xf0
[ 1917.133154] [<c10640a6>] __run_hrtimer+0x56/0xe0
[ 1917.133154] [<c1071420>] ? tick_sched_timer+0x0/0xf0
[ 1917.133154] [<c1071420>] ? tick_sched_timer+0x0/0xf0
[ 1917.133154] [<c1064aa5>] hrtimer_interrupt+0x145/0x270
[ 1917.133154] [<c101c48c>] smp_apic_timer_interrupt+0x5c/0xb0
[ 1917.133154] [<c12582b8>] ? trace_hardirqs_off_thunk+0xc/0x14
[ 1917.133154] [<c1003e36>] apic_timer_interrupt+0x36/0x3c
[ 1917.133154] [<c141ad61>] ? _read_unlock_irqrestore+0x41/0x70
[ 1917.133154] [<c10f7415>] find_and_get_object+0x75/0xe0
[ 1917.133154] [<c10f73a0>] ? find_and_get_object+0x0/0xe0
[ 1917.133154] [<c10f7577>] scan_block+0x87/0x110
[ 1917.133154] [<c10f7880>] kmemleak_scan+0x280/0x420
[ 1917.133154] [<c10f7600>] ? kmemleak_scan+0x0/0x420
[ 1917.133154] [<c10f80b0>] ? kmemleak_scan_thread+0x0/0xc0
[ 1917.133154] [<c10f8100>] kmemleak_scan_thread+0x50/0xc0
[ 1917.133154] [<c105ff54>] kthread+0x84/0x90
[ 1917.133154] [<c105fed0>] ? kthread+0x0/0x90
[ 1917.133154] [<c100401b>] kernel_thread_helper+0x7/0x1c
[ 1979.742347] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
[ 2589.860586] INFO: RCU detected CPU 1 stall (t=686958/3000 jiffies)
[ 2589.860586] Pid: 511, comm: kmemleak Not tainted 2.6.31-rc2-nv-git1-00254-ga4e2f0a-dirty #2
[ 2589.860586] Call Trace:
[ 2589.860586] [<c141784d>] ? printk+0x23/0x36
[ 2589.860586] [<c10a81e0>] __rcu_pending+0x140/0x210
[ 2589.860586] [<c10a82da>] rcu_pending+0x2a/0x70
[ 2589.860586] [<c1051b8f>] update_process_times+0x3f/0x80
[ 2589.860586] [<c107148f>] tick_sched_timer+0x6f/0xf0
[ 2589.860586] [<c10640a6>] __run_hrtimer+0x56/0xe0
[ 2589.860586] [<c1071420>] ? tick_sched_timer+0x0/0xf0
[ 2589.860586] [<c1071420>] ? tick_sched_timer+0x0/0xf0
[ 2589.860586] [<c1064aa5>] hrtimer_interrupt+0x145/0x270
[ 2589.860586] [<c104b6c8>] ? _local_bh_enable+0x68/0xd0
[ 2589.860586] [<c101c48c>] smp_apic_timer_interrupt+0x5c/0xb0
[ 2589.860586] [<c12582b8>] ? trace_hardirqs_off_thunk+0xc/0x14
[ 2589.860586] [<c1003e36>] apic_timer_interrupt+0x36/0x3c
[ 2589.860586] [<c10795d5>] ? lock_acquire+0xb5/0x120
[ 2589.860586] [<c10f73a0>] ? find_and_get_object+0x0/0xe0
[ 2589.860586] [<c10f73a0>] ? find_and_get_object+0x0/0xe0
[ 2589.860586] [<c10f73eb>] find_and_get_object+0x4b/0xe0
[ 2589.860586] [<c10f73a0>] ? find_and_get_object+0x0/0xe0
[ 2589.860586] [<c10f7577>] scan_block+0x87/0x110
[ 2589.860586] [<c10f7880>] kmemleak_scan+0x280/0x420
[ 2589.860586] [<c10f7600>] ? kmemleak_scan+0x0/0x420
[ 2589.860586] [<c10f80b0>] ? kmemleak_scan_thread+0x0/0xc0
[ 2589.860586] [<c10f8100>] kmemleak_scan_thread+0x50/0xc0
[ 2589.860586] [<c105ff54>] kthread+0x84/0x90
[ 2589.860586] [<c105fed0>] ? kthread+0x0/0x90
[ 2589.860586] [<c100401b>] kernel_thread_helper+0x7/0x1c
[ 3089.007168] r8169: eth1: link down
[ 3245.897188] INFO: RCU detected CPU 1 stall (t=883769/3000 jiffies)
[ 3245.897188] Pid: 511, comm: kmemleak Not tainted 2.6.31-rc2-nv-git1-00254-ga4e2f0a-dirty #2
[ 3245.897188] Call Trace:
[ 3245.897188] [<c141784d>] ? printk+0x23/0x36
[ 3245.897188] [<c10a81e0>] __rcu_pending+0x140/0x210
[ 3245.897188] [<c10a82da>] rcu_pending+0x2a/0x70
[ 3245.897188] [<c1051b8f>] update_process_times+0x3f/0x80
[ 3245.897188] [<c107148f>] tick_sched_timer+0x6f/0xf0
[ 3245.897188] [<c10640a6>] __run_hrtimer+0x56/0xe0
[ 3245.897188] [<c1071420>] ? tick_sched_timer+0x0/0xf0
[ 3245.897188] [<c1071420>] ? tick_sched_timer+0x0/0xf0
[ 3245.897188] [<c1064aa5>] hrtimer_interrupt+0x145/0x270
[ 3245.897188] [<c101c48c>] smp_apic_timer_interrupt+0x5c/0xb0
[ 3245.897188] [<c12582b8>] ? trace_hardirqs_off_thunk+0xc/0x14
[ 3245.897188] [<c1003e36>] apic_timer_interrupt+0x36/0x3c
[ 3245.897188] [<c141ad61>] ? _read_unlock_irqrestore+0x41/0x70
[ 3245.897188] [<c10f745f>] find_and_get_object+0xbf/0xe0
[ 3245.897188] [<c10f73a0>] ? find_and_get_object+0x0/0xe0
[ 3245.897188] [<c10f7577>] scan_block+0x87/0x110
[ 3245.897188] [<c10f7880>] kmemleak_scan+0x280/0x420
[ 3245.897188] [<c10f7600>] ? kmemleak_scan+0x0/0x420
[ 3245.897188] [<c10f80b0>] ? kmemleak_scan_thread+0x0/0xc0
[ 3245.897188] [<c10f8100>] kmemleak_scan_thread+0x50/0xc0
[ 3245.897188] [<c105ff54>] kthread+0x84/0x90
[ 3245.897188] [<c105fed0>] ? kthread+0x0/0x90
[ 3245.897188] [<c100401b>] kernel_thread_helper+0x7/0x1c
[ 3290.435108] kmemleak: 5 new suspected memory leaks (see /sys/kernel/debug/kmemleak)


Hm.. Something is broken...
cat /.../kmemleak
[ 7933.537868] ================================================
[ 7933.537873] [ BUG: lock held when returning to user space! ]
[ 7933.537876] ------------------------------------------------
[ 7933.537880] cat/2897 is leaving the kernel with locks still held!
[ 7933.537884] 1 lock held by cat/2897:
[ 7933.537887] #0: (scan_mutex){+.+.+.}, at: [<c10f717c>] kmemleak_open+0x4c/0x80


> --
> Catalin
>

Sergey

Attachment: signature.asc
Description: Digital signature