Re: [BUG] race of RCU vs NOHU

From: Paul E. McKenney
Date: Mon Aug 31 2009 - 10:30:45 EST


On Mon, Aug 31, 2009 at 10:47:28AM +0200, Martin Schwidefsky wrote:
> On Fri, 21 Aug 2009 08:54:18 -0700
> "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > On Wed, Aug 12, 2009 at 09:32:33AM +0200, Martin Schwidefsky wrote:
> > > On Tue, 11 Aug 2009 11:04:07 -0700
> > > "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > > > On Tue, Aug 11, 2009 at 05:17:51PM +0200, Martin Schwidefsky wrote:
> > > > > On Tue, 11 Aug 2009 07:52:22 -0700

[ . . . ]

> > > > > We found the bug with kernel version 2.6.30 - the kernels on our test systems
> > > > > still use classic RCU. For us it is easy to switch to tree-RCU, no patch
> > > > > required.
> > > >
> > > > Ah! Could you please send me the test you use? My tests were
> > > > insufficient to force this problem to happen.
> > >
> > > There is no specific test, just a regular system boot. The boot did not
> > > finish and our tester took a dump. This boot failure seems to happen from
> > > time to time.
> >
> > OK. Has CONFIG_TREE_RCU been working for you? If so, which variant
> > of 2.6.27 do you need a backport to?
>
> We changed the configuration of our test kernels to CONFIG_TREE_RCU. So
> far the problem has not shown up again. As we are dealing with a rare race
> here this has to be taken with a grain of salt.

Thank you for trying it out!

Did you by any chance record the success and failure statistics? Perhaps
something like number of failures per unit time, time to first failure,
number of successful vs. failed reboots, or whatever? This would allow
calculation of confidence statistics.
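
As a rough illustration of the kind of calculation such numbers would
permit, here is a minimal sketch. It assumes that each boot fails
independently with a fixed probability under the old configuration, which
may well be too simple for a real race. The function names and the 10%
failure rate in the usage note are hypothetical, not measurements from
anyone's systems.

```python
import math


def prob_no_failures(p_fail: float, n_boots: int) -> float:
    """Chance of n_boots clean boots in a row if each boot still fails
    independently with probability p_fail (assumed model)."""
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if n_boots < 0:
        raise ValueError("n_boots must be non-negative")
    return (1.0 - p_fail) ** n_boots


def boots_needed(p_fail: float, confidence: float) -> int:
    """Smallest number of consecutive clean boots such that seeing them
    purely by luck has probability at most 1 - confidence."""
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - p_fail)**n <= 1 - confidence for the smallest integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))
```

With a hypothetical failure rate of one boot in ten, boots_needed(0.1, 0.95)
says how many clean boots in a row would be required before the absence of
failures is unlikely (at the 5% level) to be mere luck. The rarer the
original failure, the more clean boots are needed, which is why the raw
counts matter.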

Thanx, Paul