Re: linux-next: Tree for April 14 (Call-traces: RCU/ACPI/WQ related?)

From: Sedat Dilek
Date: Tue Apr 26 2011 - 07:45:39 EST


On Tue, Apr 26, 2011 at 7:06 AM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Sun, Apr 24, 2011 at 09:43:31AM -0700, Paul E. McKenney wrote:
>> On Sun, Apr 24, 2011 at 11:36:44AM +0200, Sedat Dilek wrote:
>> > On Sun, Apr 24, 2011 at 8:27 AM, Paul E. McKenney
>> > <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>>
>> [ . . . ]
>>
>> > > OK, this looks unrelated, but just in case, could you please try it
>> > > again with the following patch?  (Not mainlinable, debug only.)
>> > >
>> > > Also, it does look like you are still seeing a grace-period hang.
>> > > Could you please send the output of the script?  Same one as last time.
>> > >
>> > >                                                        Thanx, Paul
>> > >
>> > > ------------------------------------------------------------------------
>> > >
>> > >  debugobjects.c |    8 +++++---
>> > >  1 file changed, 5 insertions(+), 3 deletions(-)
>> > >
>> > > diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>> > > index 9d86e45..10a7c7a 100644
>> > > --- a/lib/debugobjects.c
>> > > +++ b/lib/debugobjects.c
>> > > @@ -289,10 +289,12 @@ static void debug_object_is_on_stack(void *addr, int onstack)
>> > >                 return;
>> > >
>> > >         limit++;
>> > > -       if (is_on_stack)
>> > > +       if (is_on_stack) {
>> > > +               struct rcu_head *p = (struct rcu_head *)addr;
>> > >                 printk(KERN_WARNING
>> > > -                       "ODEBUG: object is on stack, but not annotated\n");
>> > > -       else
>> > > +                       "ODEBUG: object is on stack, but not annotated: %p\n",
>> > > +                       p->func);
>> > > +       } else
>> > >                 printk(KERN_WARNING
>> > >                         "ODEBUG: object is not on stack, but annotated\n");
>> > >         WARN_ON(1);
>> > >
>> >
>> > Somehow your attached patch was not applicable.
>> > As the changes were a few lines I applied it by myself.
>> > Attached are log, dmesg and patches (orig + mine)
>>
>> Hmmm...  Does 0xc10231a1 correspond to a function in your build?  If so,
>> could you please let me know which one?
>>
>> OK, so according to "ps" the per-CPU kthread is runnable, but it appears
>> to never run.  You only have one CPU, so it cannot be waiting due to
>> running on the wrong CPU.  The only other loop is in wait_event(), and
>> that code looks good -- besides, if wait_event() was broken, we would
>> be seeing breakage everywhere.
>>
>> Peter, any thoughts on what I might have done wrong to get the scheduler
>> into a state where it was ignoring a runnable realtime task?
>
> Hello, Sedat,
>
> Here is a diagnostic patch to apply on top of sedat.2011.04.23a from
> the -rcu git tree.  Could you please try it out, let me know what
> happens, and run the last collectdebugfs.sh during the test?
>
>                                                        Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 6cf6e47..65ae701 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1524,9 +1524,9 @@ static void rcu_cpu_kthread_setrt(int cpu, int to_rt)
>                 return;
>         if (to_rt) {
>                 policy = SCHED_NORMAL;
> -               sp.sched_priority = RCU_KTHREAD_PRIO;
> +               sp.sched_priority = 0;
>         } else {
> -               policy = SCHED_FIFO;
> +               policy = SCHED_NORMAL;
>                 sp.sched_priority = 0;
>         }
>         sched_setscheduler_nocheck(t, policy, &sp);
> @@ -1566,8 +1566,8 @@ static void rcu_yield(void (*f)(unsigned long), unsigned long arg)
>         sp.sched_priority = 0;
>         sched_setscheduler_nocheck(current, SCHED_NORMAL, &sp);
>         schedule();
> -       sp.sched_priority = RCU_KTHREAD_PRIO;
> -       sched_setscheduler_nocheck(current, SCHED_FIFO, &sp);
> +       sp.sched_priority = 0;
> +       sched_setscheduler_nocheck(current, SCHED_NORMAL, &sp);
>         del_timer(&yield_timer);
>  }
>
> @@ -1671,8 +1671,8 @@ static int __cpuinit rcu_spawn_one_cpu_kthread(int cpu)
>         WARN_ON_ONCE(per_cpu(rcu_cpu_kthread_task, cpu) != NULL);
>         per_cpu(rcu_cpu_kthread_task, cpu) = t;
>         wake_up_process(t);
> -       sp.sched_priority = RCU_KTHREAD_PRIO;
> -       sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> +       sp.sched_priority = 0;
> +       sched_setscheduler_nocheck(t, SCHED_NORMAL, &sp);
>         return 0;
>  }
>
> @@ -1713,8 +1713,8 @@ static int rcu_node_kthread(void *arg)
>                                 continue;
>                         }
>                         per_cpu(rcu_cpu_has_work, cpu) = 1;
> -                       sp.sched_priority = RCU_KTHREAD_PRIO;
> -                       sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> +                       sp.sched_priority = 0;
> +                       sched_setscheduler_nocheck(t, SCHED_NORMAL, &sp);
>                         preempt_enable();
>                 }
>         }
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index a21413d..baee185 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -1307,8 +1307,8 @@ static int __cpuinit rcu_spawn_one_boost_kthread(struct rcu_state *rsp,
>         rnp->boost_kthread_task = t;
>         raw_spin_unlock_irqrestore(&rnp->lock, flags);
>         wake_up_process(t);
> -       sp.sched_priority = RCU_KTHREAD_PRIO;
> -       sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> +       sp.sched_priority = 0;
> +       sched_setscheduler_nocheck(t, SCHED_NORMAL, &sp);
>         return 0;
>  }
>
>

Hi Paul,

I have tested with your patch and kept the kernel-config file from the
previous tests (don't be confused by the new name).
I hope this helps.
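
For anyone following along: as far as can be read from the diff, the
diagnostic patch keeps the RCU kthreads on SCHED_NORMAL with priority 0
everywhere they were previously switched to SCHED_FIFO with
RCU_KTHREAD_PRIO. A rough userspace analogue of that one step is sketched
below. It is only an illustration, not kernel code: it assumes the
sched_setscheduler(2) syscall wrapper and SCHED_OTHER (the userspace name
for SCHED_NORMAL), and demote_to_normal() is a made-up helper name.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): put the calling thread on the
 * normal time-sharing policy with static priority 0, mirroring what the
 * diagnostic patch does for the RCU kthreads via
 * sched_setscheduler_nocheck(t, SCHED_NORMAL, &sp) inside the kernel.
 *
 * Returns 0 on success, -1 on failure with errno set.
 */
static int demote_to_normal(void)
{
	struct sched_param sp;

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = 0;	/* SCHED_OTHER only accepts priority 0 */

	/* pid 0 means "the calling thread" */
	return sched_setscheduler(0, SCHED_OTHER, &sp);
}
```

Calling demote_to_normal() and then sched_getscheduler(0) should report
SCHED_OTHER. Going the other way (to SCHED_FIFO) normally needs
privileges, which is why only the demotion is shown here.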

I have some questions about kernel-config options, especially the X86_UP
options and CONFIG_RCU_FANOUT=32.
To what extent can they influence our RCU issue?
The options below were not set for this round of testing, but I would
like to get some feedback on them.
Thanks in advance.

Would these settings be better suited to a UP machine?

# CONFIG_SMP is not set
# CONFIG_M486 is not set
CONFIG_M686=y
CONFIG_NR_CPUS=1

CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_HIGHMEM4G=y

Is CONFIG_RCU_FANOUT=32 OK?

This is with commit 687d7a960aea46e016182c7ce346d62c4dbd0366 ("rcu:
restrict TREE_RCU to SMP builds with !PREEMPT") reverted.

Regards,
- Sedat -

Attachment: for-paulk-7.tar.xz
Description: Binary data

Attachment: for-paulk-7.tar.xz.sha256sum
Description: Binary data