Re: current linux-2.6.git: cpusets completely broken

From: Vegard Nossum
Date: Fri Jul 11 2008 - 15:43:31 EST


On Fri, Jul 11, 2008 at 9:36 PM, Paul Menage <menage@xxxxxxxxxx> wrote:
> On Fri, Jul 11, 2008 at 12:07 PM, Vegard Nossum <vegard.nossum@xxxxxxxxx> wrote:
>>
>> The result of having CPUSETS enabled as above is a 100% reproducible
>> BUG on the very first cpu hot-unplug:
>>
>> ------------[ cut here ]------------
>> kernel BUG at xxx/linux-2.6/kernel/sched.c:5859!
>
> That doesn't quite match up with any BUG in 2.6.26-rc9 - what tree is
> this last crash based on?

Latest mainline, commit e5a5816f7875207cb0a0a7032e39a4686c5e10a4.

The BUG in question is this one:

/* called under rq->lock with disabled interrupts */
static void migrate_dead(unsigned int dead_cpu, struct task_struct *p)
{
	struct rq *rq = cpu_rq(dead_cpu);

	/* Must be exiting, otherwise would be on tasklist. */
	BUG_ON(!p->exit_state);

>> Also, this is on the latest linux-2.6.git! Since we're so close to
>> release, maybe cpusets should simply be marked BROKEN for now? (Unless
>> we can fix it, of course. The alternative is to apply Miao Xie's
>> workaround patch temporarily.)
>
> If we were going to mark anything as broken, wouldn't cpu-hotplug be
> the more appropriate victim? I suspect that there are more systems
> using cpusets in production environments than using cpu hotplug. But
> as you say, fixing it sounds better.

I'm sorry for the harsh characterization and suggestion; please accept
my apology. It was purely a result of my excitement at having made
some progress in this case.

But I have more good news; reverting this:

commit f18f982abf183e91f435990d337164c7a43d1e6d
Author: Max Krasnyansky <maxk@xxxxxxxxxxxx>
Date: Thu May 29 11:17:01 2008 -0700

    sched: CPU hotplug events must not destroy scheduler domains created by the
    cpusets

    First issue is not related to the cpusets. We're simply leaking doms_cur.
    It's allocated in arch_init_sched_domains() which is called for every
    hotplug event. So we just keep reallocation doms_cur without freeing it.
    I introduced free_sched_domains() function that cleans things up.

    Second issue is that sched domains created by the cpusets are
    completely destroyed by the CPU hotplug events. For all CPU hotplug
    events scheduler attaches all CPUs to the NULL domain and then puts
    them all into the single domain thereby destroying domains created
    by the cpusets (partition_sched_domains).
    The solution is simple, when cpusets are enabled scheduler should not
    create default domain and instead let cpusets do that. Which is
    exactly what the patch does.

    Signed-off-by: Max Krasnyansky <maxk@xxxxxxxxxxxx>
    Cc: pj@xxxxxxx
    Cc: menage@xxxxxxxxxx
    Cc: rostedt@xxxxxxxxxxx
    Acked-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
    Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

gets rid of the BUG! (Added people to Ccs.)
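
For anyone skimming the commit message above, the first issue it describes (doms_cur being reallocated on every hotplug event without being freed) can be pictured with a small userspace sketch. This is only an illustration under my own assumptions, not the kernel code: doms_cur_model, free_domains_model(), rebuild_domains_model() and simulate_hotplug_events() are hypothetical stand-ins, and the "lost" array exists only so the demo can count the forgotten blocks and release them afterwards.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_EVENTS 16

/* Hypothetical userspace stand-ins; NOT the kernel's real data structures. */
static int *doms_cur_model;       /* models the scheduler's doms_cur pointer */
static int *lost[MAX_EVENTS];     /* blocks the "buggy" path forgot about */
static int nr_lost;

/* Models the role of free_sched_domains(): drop the current array, if any. */
static void free_domains_model(void)
{
	free(doms_cur_model);         /* free(NULL) is a no-op */
	doms_cur_model = NULL;
}

/*
 * Models one hotplug-time rebuild. With free_first == 0 the old pointer is
 * simply overwritten; we record it in lost[] where real code would leak it.
 */
static int rebuild_domains_model(int free_first)
{
	int *fresh;

	if (free_first)
		free_domains_model();
	else if (doms_cur_model && nr_lost < MAX_EVENTS)
		lost[nr_lost++] = doms_cur_model;

	fresh = malloc(sizeof(*fresh));
	if (!fresh) {
		doms_cur_model = NULL;    /* old block is freed or tracked in lost[] */
		return -1;
	}
	doms_cur_model = fresh;
	return 0;
}

/* Run n simulated hotplug events; return how many blocks were forgotten. */
static int simulate_hotplug_events(int n, int free_first)
{
	int i, leaked;

	if (n > MAX_EVENTS)
		n = MAX_EVENTS;
	nr_lost = 0;
	doms_cur_model = NULL;

	for (i = 0; i < n; i++)
		if (rebuild_domains_model(free_first) < 0)
			break;

	leaked = nr_lost;

	/* Clean up so the demo itself does not leak. */
	while (nr_lost > 0)
		free(lost[--nr_lost]);
	free_domains_model();

	return leaked;
}
```

Calling simulate_hotplug_events(n, 0) models the behaviour before that part of the patch, and simulate_hotplug_events(n, 1) models freeing before each rebuild. Note that this only pictures the leak half of the commit; it says nothing about why the second half might trigger the BUG.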

Might I instead suggest a revert of this? (Again, unless somebody else
can spot the real error and fix it before 2.6.26 is out :-))


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036