Re: [GIT PULL rcu/next] rcu commits for 2.6.40

From: Ingo Molnar
Date: Mon May 16 2011 - 07:52:56 EST



* Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> On Mon, May 16, 2011 at 09:08:08AM +0200, Ingo Molnar wrote:
> >
> > * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > > > Would it have been possible to split it in two, one for the movement of the
> > > > notifiers, the other for the barrier changes?
> > > >
> > > > That way the bisection would have fingered the movement commit. Or so.
> > >
> > > In hindsight, that certainly would have been better.
> >
> > This is the Linux kernel and we *can* turn back the clock!
>
> Yay for source-code control systems in general and git in particular! ;-)
>
> > > I was afraid of that...
> > >
> > > On the off-chance that moving the memory barriers was at fault, the following
> > > patch restores all of them that don't have in situ replacements. Grasping at
> > > straws, admittedly.
> >
> > Well, the nice thing is that we really do not have to grasp at straws, and even
> > while we have no good ideas we can debug this *much* better.
> >
> > Could you please do a simple test-tree that has 3 commits:
> >
> > first one reverts the offending commit
> > second one applies the barrier part of it
> > third one applies the need_resched part of it
> >
> > ( You can do even more fine-grained steps, if you find harmless-looking bits of
> > it that can be applied separately! )
> >
> > Note, the important thing is that the tree should be a 'null pull' - i.e. the
> > revert plus the patches applied will not change anything in core/rcu.
> >
> > Obviously it would be nice if each step built fine - no need to boot test each
> > step as long as you are reasonably sure it will boot fine.
> >
> > Then I could take my reproducer and come up with a very precise bisection
> > result for you, with just a couple of minutes time spent on testing. One of the
> > commits after the revert will trigger the hang/slowdown.
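> >
> > ( A minimal sketch of one way to build such a null-pull tree - the
> >   branch name, commit id and patch file names below are placeholders,
> >   not the real ones:
> >
> >       git checkout -b rcu/split-debug rcu/next
> >       git revert <offending-commit>           # step 1: plain revert
> >       git am 0001-barrier-part.patch          # step 2: barrier changes only
> >       git am 0002-need_resched-part.patch     # step 3: need_resched changes only
> >       git diff rcu/next rcu/split-debug       # must be empty: a 'null pull'
> >
> >   the end result is identical to rcu/next, but the change is now split
> >   into individually testable steps. )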
> >
> > My prediction is that we will be much wiser after that! :-)
>
> I will put this together!
>
> In the meantime, would you be willing to try out the patch at
> https://lkml.org/lkml/2011/5/14/89? This patch helped out Yinghai
> in several configurations.

Wasn't this the one I tested - or is it a new iteration?

I'll try it in any case.

If the bug is fixed for good then we can learn no more from it, and I'd
suggest you not waste much time on a more fine-grained queue :-)

Thanks,

Ingo