[3.2-rc3] 100% CPU usage while in del_timer_sync fromiwl3945_rs_free_sta

From: Michal Hocko
Date: Tue Nov 29 2011 - 05:07:34 EST


[I am not sure whether this is ieee80211 or iwl3945 issue so put both
maintainers into loop]

Hi,
I have just noticed that my CPU0 is hogged by del_timer_sync called from
iwl3945_rs_free_sta and it doesn't seem to be able to do any progress.
Two consequent sysrq+t show that:

kworker/u:23 R running 0 2190 2 0x00000000
cdc39dec c046e940 00000000 c06cfa40 c06cfa40 eebbdf1c 00003f38 00000000
00000000 f4df3200 e22e8000 eef8c800 00003f38 cdc39da8 c0156944 00000000
f62029c0 cdc39de4 c014d8bd 00000000 eebbc346 00003f38 00000046 f62025c0
Call Trace:
[<c046e940>] ? __schedule+0x86a/0x893
[<c0156944>] ? tick_program_event+0x24/0x29
[<c014d8bd>] ? hrtimer_interrupt+0x120/0x1b2
[<c012a7dd>] ? get_parent_ip+0xb/0x31
[<c0472b69>] ? add_preempt_count+0x95/0x98
[<c01161d2>] ? smp_apic_timer_interrupt+0x6b/0x7e
[<c0470a0e>] ? apic_timer_interrupt+0x2a/0x30
[<c013d8ff>] ? lock_timer_base.isra.29+0x14/0x41
[<c013d948>] ? try_to_del_timer_sync+0x1c/0xb9
[<c013da16>] ? del_timer_sync+0x31/0x3c
[<f813a431>] ? iwl3945_rs_free_sta+0x10/0x12 [iwl3945]
[<f81c2f24>] ? __sta_info_free.isra.24+0x25/0x38 [mac80211]
[<f81c3cb4>] ? __sta_info_destroy+0x239/0x260 [mac80211]
[<f81c404c>] ? sta_info_flush+0x43/0x6e [mac80211]
[<f81c92e3>] ? ieee80211_set_disassoc+0x1a3/0x1d7 [mac80211]
[<f81c9478>] ? ieee80211_sta_connection_lost+0x31/0x8a [mac80211]
[<c031136f>] ? dev_printk+0x2b/0x2d
[<f81cbb42>] ? ieee80211_sta_work+0x113/0x126 [mac80211]
[<f81ce0f9>] ? ieee80211_iface_work+0x23e/0x250 [mac80211]
[<c0146673>] ? process_one_work+0x19b/0x2e6
[<f81cdebb>] ? ieee80211_netdev_select_queue+0x14/0x14 [mac80211]
[<c0146ba1>] ? worker_thread+0x136/0x1ee
[<c0146a6b>] ? manage_workers.isra.23+0x14f/0x14f
[<c0149eb9>] ? kthread+0x67/0x6c
[<c0149e52>] ? kthread_worker_fn+0x119/0x119
[<c04755b6>] ? kernel_thread_helper+0x6/0x10

and

kworker/u:23 R running 0 2190 2 0x00000000
00000000 0001103c 00000000 00000000 cdc39d7c c0115c2f cdc39d98 c0155b4c
00000001 f62025c0 f62025c0 9f8a8400 00003f62 cdc39da8 c0156944 00000000
f62029c0 cdc39de4 c014d8bd 00000000 9f4d7f93 00003f62 00000046 f62025c0
Call Trace:
[<c0115c2f>] ? lapic_next_event+0x1b/0x1f
[<c0155b4c>] ? clockevents_program_event+0xea/0x108
[<c0156944>] ? tick_program_event+0x24/0x29
[<c014d8bd>] ? hrtimer_interrupt+0x120/0x1b2
[<c012a7dd>] ? get_parent_ip+0xb/0x31
[<c0472ac7>] ? sub_preempt_count+0x81/0x8e
[<c0138913>] ? irq_exit+0x90/0x92
[<c01161d7>] ? smp_apic_timer_interrupt+0x70/0x7e
[<c0470a0e>] ? apic_timer_interrupt+0x2a/0x30
[<c013d8fb>] ? lock_timer_base.isra.29+0x10/0x41
[<c013d948>] ? try_to_del_timer_sync+0x1c/0xb9
[<c013da16>] ? del_timer_sync+0x31/0x3c
[<f813a431>] ? iwl3945_rs_free_sta+0x10/0x12 [iwl3945]
[<f81c2f24>] ? __sta_info_free.isra.24+0x25/0x38 [mac80211]
[<f81c3cb4>] ? __sta_info_destroy+0x239/0x260 [mac80211]
[<f81c404c>] ? sta_info_flush+0x43/0x6e [mac80211]
[<f81c92e3>] ? ieee80211_set_disassoc+0x1a3/0x1d7 [mac80211]
[<f81c9478>] ? ieee80211_sta_connection_lost+0x31/0x8a [mac80211]
[<c031136f>] ? dev_printk+0x2b/0x2d
[<f81cbb42>] ? ieee80211_sta_work+0x113/0x126 [mac80211]
[<f81ce0f9>] ? ieee80211_iface_work+0x23e/0x250 [mac80211]
[<c0146673>] ? process_one_work+0x19b/0x2e6
[<f81cdebb>] ? ieee80211_netdev_select_queue+0x14/0x14 [mac80211]
[<c0146ba1>] ? worker_thread+0x136/0x1ee
[<c0146a6b>] ? manage_workers.isra.23+0x14f/0x14f
[<c0149eb9>] ? kthread+0x67/0x6c
[<c0149e52>] ? kthread_worker_fn+0x119/0x119
[<c04755b6>] ? kernel_thread_helper+0x6/0x10

Looks we are stuck in lock_timer_base loop. I am not familiar with the
code but either base is NULL or we are racing with timer->base changes.
I guess that the first one sounds more probable. The timer in question
is &rs_sta->rate_scale_flush.

The disassoc came after resume from RAM when the network was gone (I
have moved to the office from home). I didn't see this issue with 3.1
kernel and early 3.20rc* were too unstable on my laptop so I had to skip
them.

config is attached.

Let me know if you need any further information.

Thanks
--
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

Attachment: config.gz
Description: Binary data