Re: [PATCH 2.6.16-mm1 2/2] sched_domains: Allocate sched_groups dynamically

From: Siddha, Suresh B
Date: Thu Jul 06 2006 - 20:16:27 EST


I have reviewed Srivatsa's code back then and made sure it fixes
the problem in the presence of multi-core too. Based on my review,
I provided feedback to Srivatsa then.

In short, multi-core was broken too and Srivatsa's patch fixed it.

thanks,
suresh

On Thu, Jul 06, 2006 at 05:01:51PM -0700, Paul Jackson wrote:
> Several months ago, Srivatsa wrote:
> > I couldn't test this on a multi-core machine, since I don't think we
> > have one in our lab.
> >
> > Suresh, would you mind testing the patch on a multi-core machine, in case you
> > have access to one?
> >
> > Basically you would need to do create a exclusive CPUset with one CPU in it
> > (ensure that its sibling in the same core is not part of the same
> > CPUset). As soon as you make the CPUset exclusive, you would hit some
> > kind of hang. With this patch, the hang should go away.
>
>
> Summary: Where do we stand with multi-core and this bug?
>
>
> I don't see a reply from Suresh on whether he could test on multi-core.
>
> I finally happened to be running on a hyper-threaded box last week,
> and stumbled over this bug that Srivatsa's patch fixes. Hawkes
> remembered Srivatsa's patch, I tried it, and it worked. Thanks!
>
> But now I'm quite confused as to the situation with multi-core.
>
> Details of my confusions, for the bored:
>
> From Srivatsa's remark, I would have guessed that multi-core was
> at risk for this bug too, but Srivatsa was hopeful that his patch
> would fix that too.
>
> Early this week, a couple of people who shall remain anonymous here
> raised the question of whether we had the same problem with multi-core.
> One of them believed that multi-core did have the same problem.
>
> I got a little time on a multi-core system this morning to test it,
> and while running what I -thought- was a kernel -without- Srivatsa's
> patch, I could not find any problem. I made a cpuset with just a
> single logical cpu in it, and marked it cpu_exclusive, and the
> system did not hang.
>
> It will be another day before I can get on that multi-core system
> again to verify my findings.
>
> I was hoping that someone could actually -read- this code and state
> with confidence that one of the following held:
> * it was already working ok on multi-core (a one CPU cpu_exclusive cpuset),
> * it was broken, but Srivatsa's patch fixes it, or
> * it's still broken, even with Srivatsa's patch.
>
> I tried a couple of times to read the code myself, but could not
> make any headway there.
>
> So ... what's up with multi-core and this bug?
>
> --
> I won't rest till it's the best ...
> Programmer, Linux Scalability
> Paul Jackson <pj@xxxxxxx> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/