Re: rcu: endless stalls

From: Mike Galbraith
Date: Thu Jun 14 2012 - 03:45:32 EST


On Wed, 2012-06-13 at 09:12 +0200, Mike Galbraith wrote:
> On Wed, 2012-06-13 at 07:56 +0200, Mike Galbraith wrote:
>
> > Question remains though. Maybe the box hit some other problem that led
> > to death by RCU gripage, but the info I received indicated the box was
> > in the midst of a major spin-fest.
>
> To put it (maybe) more clearly: since it's a mutex like any other mutex
> that loads of CPUs can hit if you've got loads of CPUs, did the huge box
> driver do something we don't expect so many CPUs to be doing, and thus
> instigate simultaneous exit trouble (i.e. shoot itself in the foot), or
> did that mutex addition create the exit trouble the box appeared to be
> having?

Crickets chirping.. I know what _that_ means: "tsk tsk, you dummy" :)

I suspected that would happen, but asked anyway because I couldn't
imagine even 4096 CPUs getting tangled up for an _eternity_ trying to go
to sleep. But the lock, which landed after 32-stable (where these beasts
earn their daily fuel rods), was splattered all over the event. Oh well.

So, I can forget that and just make the thing not gripe itself to death
should a stall for whatever reason be encountered again.

Rather than mucking about with rcu_cpu_stall_suppress, how about
adjusting the timeout as you proceed, and blocking the other report
functions? That way, there's no fiddling with things used elsewhere, and
no matter how badly the console is being hammered, you get a full
report, and maybe even only one.

Hm, maybe I should forget about keeping check_cpu_stall() happy too,
and just silently ignore it when a report is busy.
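
In toy form, the idea looks roughly like this -- a user-space model with
made-up names, pthreads standing in for the rcu_node root lock and
sleep() standing in for a slow serial console. The real thing is the
rcutree.c patch below; this is only a sketch of the "one reporter, push
the deadline back as you print" pattern:

/* toy model -- compile with: gcc -O2 -o stall_toy stall_toy.c -lpthread */
#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define NR_CPUS		16
#define STALL_TIMEOUT	3	/* seconds, stand-in for jiffies_till_stall_check() */

static pthread_mutex_t root_lock = PTHREAD_MUTEX_INITIALIZER;
static time_t stall_deadline;
static int report_in_progress;

static void report_stall(int cpu)
{
	int i;

	pthread_mutex_lock(&root_lock);
	if (report_in_progress || time(NULL) < stall_deadline) {
		/* Someone else is already ratting, or it's too early. */
		pthread_mutex_unlock(&root_lock);
		return;
	}
	report_in_progress = 1;
	stall_deadline = time(NULL) + STALL_TIMEOUT;
	pthread_mutex_unlock(&root_lock);

	printf("INFO: stall detected by cpu %d\n", cpu);
	for (i = 0; i < NR_CPUS; i++) {
		printf("cpu %d: ... per-cpu stall info ...\n", i);
		sleep(1);	/* slow console */

		/* Push the deadline back as we go, so nobody re-triggers. */
		pthread_mutex_lock(&root_lock);
		stall_deadline = time(NULL) + STALL_TIMEOUT;
		pthread_mutex_unlock(&root_lock);
	}

	pthread_mutex_lock(&root_lock);
	report_in_progress = 0;
	pthread_mutex_unlock(&root_lock);
}

static void *cpu_thread(void *arg)
{
	report_stall((int)(long)arg);
	return NULL;
}

int main(void)
{
	pthread_t tid[NR_CPUS];
	long i;

	stall_deadline = time(NULL);	/* already expired: stall "detected" */
	for (i = 0; i < NR_CPUS; i++)
		pthread_create(&tid[i], NULL, cpu_thread, (void *)i);
	for (i = 0; i < NR_CPUS; i++)
		pthread_join(tid[i], NULL);
	return 0;	/* exactly one full report comes out, not NR_CPUS of them */
}

Anyway, the actual patch: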

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 0da7b88..e9dd654 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -727,24 +727,29 @@ static void record_gp_stall_check_time(struct rcu_state *rsp)
rsp->jiffies_stall = jiffies + jiffies_till_stall_check();
}

+int rcu_stall_report_in_progress;
+
static void print_other_cpu_stall(struct rcu_state *rsp)
{
int cpu;
long delta;
unsigned long flags;
int ndetected;
- struct rcu_node *rnp = rcu_get_root(rsp);
+ struct rcu_node *root = rcu_get_root(rsp);
+ struct rcu_node *rnp;

/* Only let one CPU complain about others per time interval. */

- raw_spin_lock_irqsave(&rnp->lock, flags);
+ raw_spin_lock_irqsave(&root->lock, flags);
delta = jiffies - rsp->jiffies_stall;
- if (delta < RCU_STALL_RAT_DELAY || !rcu_gp_in_progress(rsp)) {
- raw_spin_unlock_irqrestore(&rnp->lock, flags);
+ if (delta < RCU_STALL_RAT_DELAY || !rcu_gp_in_progress(rsp) ||
+ rcu_stall_report_in_progress) {
+ raw_spin_unlock_irqrestore(&root->lock, flags);
return;
}
rsp->jiffies_stall = jiffies + 3 * jiffies_till_stall_check() + 3;
- raw_spin_unlock_irqrestore(&rnp->lock, flags);
+ rcu_stall_report_in_progress++;
+ raw_spin_unlock_irqrestore(&root->lock, flags);

/*
* OK, time to rat on our buddy...
@@ -765,16 +770,23 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
print_cpu_stall_info(rsp, rnp->grplo + cpu);
ndetected++;
}
+
+ /*
+ * Push the timeout back as we go. With a slow serial
+ * console on a large machine, this may take a while.
+ */
+ raw_spin_lock_irqsave(&root->lock, flags);
+ rsp->jiffies_stall = jiffies + 3 * jiffies_till_stall_check() + 3;
+ raw_spin_unlock_irqrestore(&root->lock, flags);
}

/*
* Now rat on any tasks that got kicked up to the root rcu_node
* due to CPU offlining.
*/
- rnp = rcu_get_root(rsp);
- raw_spin_lock_irqsave(&rnp->lock, flags);
- ndetected = rcu_print_task_stall(rnp);
- raw_spin_unlock_irqrestore(&rnp->lock, flags);
+ raw_spin_lock_irqsave(&root->lock, flags);
+ ndetected = rcu_print_task_stall(root);
+ raw_spin_unlock_irqrestore(&root->lock, flags);

print_cpu_stall_info_end();
printk(KERN_CONT "(detected by %d, t=%ld jiffies)\n",
@@ -784,6 +796,10 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
else if (!trigger_all_cpu_backtrace())
dump_stack();

+ raw_spin_lock_irqsave(&root->lock, flags);
+ rcu_stall_report_in_progress--;
+ raw_spin_unlock_irqrestore(&root->lock, flags);
+
/* If so configured, complain about tasks blocking the grace period. */

rcu_print_detail_task_stall(rsp);
@@ -796,6 +812,17 @@ static void print_cpu_stall(struct rcu_state *rsp)
unsigned long flags;
struct rcu_node *rnp = rcu_get_root(rsp);

+ raw_spin_lock_irqsave(&rnp->lock, flags);
+ if (rcu_stall_report_in_progress) {
+ raw_spin_unlock_irqrestore(&rnp->lock, flags);
+ return;
+ }
+
+ /* Reset timeout, dump_stack() may take a while on large machines. */
+ rsp->jiffies_stall = jiffies + 3 * jiffies_till_stall_check() + 3;
+ rcu_stall_report_in_progress++;
+ raw_spin_unlock_irqrestore(&rnp->lock, flags);
+
/*
* OK, time to rat on ourselves...
* See Documentation/RCU/stallwarn.txt for info on how to debug
@@ -813,6 +840,7 @@ static void print_cpu_stall(struct rcu_state *rsp)
if (ULONG_CMP_GE(jiffies, rsp->jiffies_stall))
rsp->jiffies_stall = jiffies +
3 * jiffies_till_stall_check() + 3;
+ rcu_stall_report_in_progress--;
raw_spin_unlock_irqrestore(&rnp->lock, flags);

set_need_resched(); /* kick ourselves to get things going. */

-Mike
