Re: linux-next: Tree for May 26 (RCU stalls)

From: Sedat Dilek
Date: Thu May 26 2011 - 14:31:35 EST


On Thu, May 26, 2011 at 7:31 PM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Thu, May 26, 2011 at 05:48:32PM +0200, Sedat Dilek wrote:
>> On Thu, May 26, 2011 at 8:39 AM, Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx> wrote:
>> > Hi all,
>> >
>> > [The kernel.org mirroring is being slow today]
>> >
>> > Changes since 20110525:
>> >
>> > Linus' tree gained a build failure for which I applied a patch.
>> >
>> > The m68knommu tree lost its conflicts.
>> >
>> > The hwmon-staging lost its conflict.
>> >
>> > The wireless lost its conflict.
>> >
>> > The mmc lost its conflict.
>> >
>> > The dwmw2-iommu tree lost its conflict.
>> >
>> > The kvm tree still had its build failure so I used the version from
>> > next-20110524.
>> >
>> > The namespace lost its conflicts.
>> >
>> > ----------------------------------------------------------------------------
>> >
>>
>> Hi,
>>
>> I see these call-traces on x86 UP machine:
>>
>> [  240.268061] INFO: task rcun0:8 blocked for more than 120 seconds.
>> [  240.268069] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
>> disables this message.
>> [  240.268072] rcun0           D 00000000     0     8      2 0x00000000
>> [  240.268079]  f6473fb8 00000046 013131b6 00000000 c1461ac0 00000000
>> 00000000 c1461ac0
>> [  240.268089]  00000000 00000000 f645dc70 f645bf60 00000003 f6473f78
>> c102a570 f6473f9c
>> [  240.268097]  c1021476 00000000 f645bf6c 00000001 00000000 00000286
>> f6473f9c c129b35a
>> [  240.268106] Call Trace:
>> [  240.268121]  [<c102a570>] ? default_wake_function+0xb/0xd
>> [  240.268127]  [<c1021476>] ? __wake_up_common+0x33/0x5b
>> [  240.268134]  [<c129b35a>] ? _raw_spin_unlock_irqrestore+0xe/0x10
>> [  240.268140]  [<c10234ed>] ? complete+0x34/0x3e
>> [  240.268147]  [<c1074d23>] ? cpumask_weight+0xc/0xc
>> [  240.268157]  [<c1044c97>] kthread+0x53/0x67
>> [  240.268162]  [<c1044c44>] ? kthread_worker_fn+0x111/0x111
>> [  240.268169]  [<c12a123e>] kernel_thread_helper+0x6/0xd
>>
>> dmesg and kernel-config are attached.
>
> Hello, Sedat,
>
> Does the following patch clear things up?
>
>                                                        Thanx, Paul
>
> ------------------------------------------------------------------------
>
> rcu: Start RCU kthreads in TASK_INTERRUPTIBLE state
>
> Upon creation, kthreads are in TASK_UNINTERRUPTIBLE state, which can
> result in softlockup warnings.  Because some of RCU's kthreads can
> legitimately be idle indefinitely, start them in TASK_INTERRUPTIBLE
> state in order to avoid those warnings.
>
> Suggested-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Signed-off-by: Paul E. McKenney <paul.mckenney@xxxxxxxxxx>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> Tested-by: Yinghai Lu <yinghai@xxxxxxxxxx>
>
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index a1a8bb6..40aab8d 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1647,6 +1647,7 @@ static int __cpuinit rcu_spawn_one_cpu_kthread(int cpu)
>        if (IS_ERR(t))
>                return PTR_ERR(t);
>        kthread_bind(t, cpu);
> +       set_task_state(t, TASK_INTERRUPTIBLE);
>        per_cpu(rcu_cpu_kthread_cpu, cpu) = cpu;
>        WARN_ON_ONCE(per_cpu(rcu_cpu_kthread_task, cpu) != NULL);
>        per_cpu(rcu_cpu_kthread_task, cpu) = t;
> @@ -1754,6 +1755,7 @@ static int __cpuinit rcu_spawn_one_node_kthread(struct rcu_state *rsp,
>                if (IS_ERR(t))
>                        return PTR_ERR(t);
>                raw_spin_lock_irqsave(&rnp->lock, flags);
> +               set_task_state(t, TASK_INTERRUPTIBLE);
>                rnp->node_kthread_task = t;
>                raw_spin_unlock_irqrestore(&rnp->lock, flags);
>                sp.sched_priority = 99;
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index 049f278..a767b7d 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -1295,6 +1295,7 @@ static int __cpuinit rcu_spawn_one_boost_kthread(struct rcu_state *rsp,
>        if (IS_ERR(t))
>                return PTR_ERR(t);
>        raw_spin_lock_irqsave(&rnp->lock, flags);
> +       set_task_state(t, TASK_INTERRUPTIBLE);
>        rnp->boost_kthread_task = t;
>        raw_spin_unlock_irqrestore(&rnp->lock, flags);
>        sp.sched_priority = RCU_KTHREAD_PRIO;
>

Thanks for the quick reply and patch!

On a first look at dmesg, the RCU stalls are gone.
I tested against linux-next (next-20110526).
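
For anyone following along, here is a rough userspace sketch of why the
state change silences the "blocked for more than 120 seconds" message.
It is only a simplified model under my own assumptions, not kernel code:
the names model_task, model_state and model_check_hung are hypothetical,
and the real detector does more (timeouts, frozen tasks, and so on). The
idea it illustrates is that only a task sleeping uninterruptibly whose
context-switch count has not moved between two checks gets reported, so
an idle kthread parked in an interruptible sleep is never flagged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace model; every name here is hypothetical. */
enum model_state {
	MODEL_RUNNING,
	MODEL_INTERRUPTIBLE,	/* stands in for TASK_INTERRUPTIBLE */
	MODEL_UNINTERRUPTIBLE	/* stands in for TASK_UNINTERRUPTIBLE */
};

struct model_task {
	enum model_state state;
	unsigned long switch_count;	 /* context switches so far */
	unsigned long last_switch_count; /* value seen at the previous check */
};

/*
 * Returns true when this model would warn about the task: it must be
 * in uninterruptible sleep and must not have been scheduled since the
 * previous check.
 */
static bool model_check_hung(struct model_task *t)
{
	assert(t != NULL);

	/* Interruptible or running tasks are never reported. */
	if (t->state != MODEL_UNINTERRUPTIBLE)
		return false;

	/* The task ran since the last check: remember that and move on. */
	if (t->switch_count != t->last_switch_count) {
		t->last_switch_count = t->switch_count;
		return false;
	}

	return true;
}
```

Calling model_check_hung() twice on an idle task in
MODEL_UNINTERRUPTIBLE state reports it on the second call, while the same
task in MODEL_INTERRUPTIBLE state is never reported, which matches what
the patch is aiming for.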

Feel free to add:

Tested-by: Sedat Dilek <sedat.dilek@xxxxxxxxx>

- Sedat -