Re: [tip:core/rcu] rcu: Consolidate sparse and lockdep declarations in include/linux/rcupdate.h

From: Paul E. McKenney
Date: Mon Aug 24 2009 - 12:08:05 EST


On Mon, Aug 24, 2009 at 11:28:36AM +0200, Ingo Molnar wrote:
>
> * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > On Sun, Aug 23, 2009 at 12:33:56PM -0700, Paul E. McKenney wrote:
> > > On Sun, Aug 23, 2009 at 08:42:02PM +0200, Ingo Molnar wrote:
> > > >
> > > > * tip-bot for Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > > Commit-ID: bc33f24bdca8b6e97376e3a182ab69e6cdefa989
> > > > > Gitweb: http://git.kernel.org/tip/bc33f24bdca8b6e97376e3a182ab69e6cdefa989
> > > > > Author: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > > > > AuthorDate: Sat, 22 Aug 2009 13:56:47 -0700
> > > > > Committer: Ingo Molnar <mingo@xxxxxxx>
> > > > > CommitDate: Sun, 23 Aug 2009 10:32:37 +0200
> > > > >
> > > > > rcu: Consolidate sparse and lockdep declarations in include/linux/rcupdate.h
> > > >
> > > > -tip testing found a spontaneous reboot crash, which i
> > > > bisected back to this commit:
> > > >
> > > > bc33f24bdca8b6e97376e3a182ab69e6cdefa989 is first bad commit
> > > >
> > > > the reboot happens during the ftrace syscall tracepoints
> > > > self-test:
> > > >
> > > > [ 34.618832] Testing event sys_exit_set_robust_list: OK
> > > > [ 34.635511] Testing event sys_enter_get_robust_list: OK
> > > > [ 34.652164] Testing event sys_exit_get_robust_list: OK
> > > > [ 34.668844] Testing event sys_enter_futex: OK
> > > > [ 34.685495] Testing event sys_exit_futex: OK
> > > > [ 34.702170] Testing event lock_acquire: [instant reboot]
> > > >
> > > > There's no log message - just a reboot - which signals some
> > > > severe crash - perhaps some locking related infinite
> > > > recursion or something like that?
> > >
> > > Pretty impressive for having mostly moved RCU's lockdep-related
> > > declarations from one file to another... :-/
> > >
> > > Looking into it, probably a typo on my part.
> >
> > I rechecked this several times, and don't see how anything else
> > should have noticed this patch. That said, that self-test is
> > rather amazing code, so...
>
> What do you think about the infinite recursion bug that Lai
> Jiangshan's found in that patch? (sharp eyes really!) That might
> explain the crash -tip testing found.

Indeed!!!

My thought is to send you a temporary patch stack fixing that bug
(and supplying a CPU-hotplug fix), allowing your testing to progress
(but defeating bisection if using ftrace).

Then I would create a new patch stack applying the fix for the bugs Lai
Jiangshan found to commit bc33f24, rebasing, retesting, and supplying
this new stack to replace commit bc33f24 and later in tip/core/rcu.

Seem reasonable?

Thanx, Paul