Re: [PATCH rcu/urgent 0/6] Fixes for RCU/scheduler/irq-threads trainwreck

From: Paul E. McKenney
Date: Wed Jul 20 2011 - 17:25:57 EST


On Wed, Jul 20, 2011 at 01:54:49PM -0700, Ben Greear wrote:
> On 07/20/2011 01:33 PM, Paul E. McKenney wrote:
> >On Wed, Jul 20, 2011 at 09:57:42PM +0200, Ingo Molnar wrote:
> >>
> >>* Ingo Molnar<mingo@xxxxxxx> wrote:
> >>
> >>>
> >>>* Paul E. McKenney<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>>>If my guess is correct, then the minimal non-RCU_BOOST fix is #4
> >>>>(which drags along #3) and #6. Which are not one-liners, but
> >>>>somewhat smaller:
> >>>>
> >>>> b/kernel/rcutree_plugin.h | 12 ++++++------
> >>>> b/kernel/softirq.c | 12 ++++++++++--
> >>>> kernel/rcutree_plugin.h | 31 +++++++++++++++++++++++++------
> >>>> 3 files changed, 41 insertions(+), 14 deletions(-)
> >>>
> >>>That's half the patch size and half the patch count.
> >>>
> >>>PeterZ's question is relevant: since we apparently had similar bugs
> >>>in v2.6.39 as well, what changed in v3.0 that makes them so urgent
> >>>to fix?
> >>>
> >>>If it's just better instrumentation that proves them better then
> >>>i'd suggest fixing this in v3.1 and not risking v3.0 with an
> >>>unintended side effect.
> >>
> >>Ok, i looked some more at the background and the symptoms that people
> >>are seeing: kernel crashes and lockups. I think we want these
> >>problems fixed in v3.0, even if it was the recent introduction of
> >>RCU_BOOST that made it really prominent.
> >>
> >>Having put some testing into your rcu/urgent branch today i'd feel
> >>more comfortable with taking this plus perhaps an RCU_BOOST disabling
> >>patch. That makes it all fundamentally tested by a number of people
> >>(including those who reported/reproduced the problems).
> >
> >RCU_BOOST is currently default=n. Is that sufficient? If not, one
>
> Not if it remains broken, I think... unless you put it under CONFIG_BROKEN
> or something. Otherwise, folks are liable to turn it on and not realize
> it's the cause of subtle bugs.

Good point, I could easily add "depends on BROKEN".
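
For illustration only, such a change might look roughly like the Kconfig
fragment below. This is a sketch, not a posted patch: the existing
"depends on" line and the prompt string are assumptions about how the
RCU_BOOST entry currently reads, and the help text is omitted. The only
intended change is the added "&& BROKEN".

```kconfig
# Sketch only: the existing dependencies shown here are assumed,
# not copied from a posted patch.
config RCU_BOOST
	bool "Enable RCU priority boosting"
	# "&& BROKEN" is the addition; it hides the option unless BROKEN is set.
	depends on RT_MUTEXES && PREEMPT_RCU && BROKEN
	default n
```

With a dependency like this, the option would no longer be offered in
ordinary configurations, so nobody could enable it without first opting
in to known-broken code.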

> For what it's worth, my tests have been running clean for around 2 hours,
> so the full set of fixes with RCU_BOOST appears good, so far. I'll let it
> continue to run at least overnight to make sure I'm not just getting lucky...

Continuing to think good thoughts... ;-)

Thanx, Paul