Re: Question concerning RCU

From: Stoidner, Christoph
Date: Mon Jan 12 2015 - 06:48:56 EST


Hi Paul,

> You got stack traces with the stall warnings, correct? If so, please look
> at them and at Documentation/RCU/stallwarn.txt and see if the kernel is
> looping somewhere inappropriate.

Yes and no. I have a stack trace, but it was not generated by a stall warning.
More precisely: I never see any stall warning, because the system freezes at
the moment it is about to output one. Instead, the stack trace was captured
with gdb over JTAG hardware debugging after the freeze had occurred.

So I am not sure whether there really is a CPU stall condition or whether the
stall detection is spurious. Either way, it is the attempt to output the stall
warning that freezes the system, so the warning is never seen.

> I am not familiar with the low-level ARM kernel code, but the stack below
> leads me to suspect that your kernel is interrupting itself to death or
> is improperly handling interrupts.

The stack trace must be read from bottom to top. The repeated occurrence of
"__irq_svc () at arch/arm/kernel/entry-armv.S:202" at the bottom of the stack
trace comes from the stack frame of the interrupt context. This is completely
legitimate and also appears in normal situations. The problem is instead at the
top of the stack trace, in the function rcu_print_task_stall(). The loop in
rcutree_plugin.h at line 528 never ends:

static int rcu_print_task_stall(struct rcu_node *rnp)
{
	...
	...

	list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
		printk(KERN_CONT " P%d", t->pid);
		ndetected++;
	}

	...
	...
}

That means list_for_each_entry_continue() never terminates, because
rcu_node_entry.next seems to point to itself rather than back to
rnp->blkd_tasks. I have no idea how this can happen.
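
To illustrate the failure mode described above, here is a minimal userspace
sketch, not kernel code. It assumes a simplified stand-in for struct list_head
(called list_node here) and a hypothetical helper walk_bounded() that mimics
the traversal order of list_for_each_entry_continue(): it starts at the entry
after the given position and stops only when it reaches the list head. The
step limit is my addition for the demonstration; the kernel macro has no such
limit, which is why a node whose next pointer refers to itself makes the loop
spin forever.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head (assumption). */
struct list_node {
	struct list_node *next;
	struct list_node *prev;
};

/* An empty circular list: the head points to itself in both directions. */
static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert 'entry' just before 'head', i.e. at the tail of the list. */
static void list_add_tail_node(struct list_node *entry, struct list_node *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/*
 * Hypothetical helper: walk forward from the entry after 'start' until
 * 'head' is reached, the same termination condition the kernel macro uses.
 * Returns the number of entries visited, or -1 if 'head' was not reached
 * within 'max_steps' entries (the list is then presumed corrupted).
 */
static long walk_bounded(const struct list_node *start,
			 const struct list_node *head, long max_steps)
{
	const struct list_node *pos;
	long visited = 0;

	assert(start != NULL && head != NULL);

	for (pos = start->next; pos != head; pos = pos->next) {
		if (visited >= max_steps)
			return -1;	/* never got back to the head */
		visited++;
	}
	return visited;
}
```

With an intact list of three entries a, b and c, walking from a visits the two
remaining entries and stops at the head. If b.next is then set to point at b
itself, the same walk never sees the head again and walk_bounded() returns -1
once the step limit is hit. As a general observation, not a diagnosis of this
case: in the kernel, INIT_LIST_HEAD() and list_del_init() both leave a node
pointing to itself, so a walker still holding such a node would show exactly
this pattern.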

One more thing: just for testing, I have now enabled CONFIG_TINY_PREEMPT_RCU.
Since then the problem has not occurred again. Do you have any idea what makes
the difference here?

Thanks and regards,
Christoph