Re: current linux-2.6.git: cpusets completely broken

From: Max Krasnyansky
Date: Fri Jul 11 2008 - 16:07:20 EST


Vegard Nossum wrote:
On Fri, Jul 11, 2008 at 9:36 PM, Paul Menage <menage@xxxxxxxxxx> wrote:
On Fri, Jul 11, 2008 at 12:07 PM, Vegard Nossum <vegard.nossum@xxxxxxxxx> wrote:
The result of having CPUSETS enabled as above is a 100% reproducible
BUG on the very first cpu hot-unplug:

------------[ cut here ]------------
kernel BUG at xxx/linux-2.6/kernel/sched.c:5859!

That doesn't quite match up with any BUG in 2.6.26-rc9 - what tree is
this last crash based on?

latest mainline. Commit e5a5816f7875207cb0a0a7032e39a4686c5e10a4.

Is this one:

/* called under rq->lock with disabled interrupts */
static void migrate_dead(unsigned int dead_cpu, struct task_struct *p)
{
	struct rq *rq = cpu_rq(dead_cpu);

	/* Must be exiting, otherwise would be on tasklist. */
	BUG_ON(!p->exit_state);

Also, this is on the latest linux-2.6.git! Since we're so close to
release, maybe cpusets should simply be marked BROKEN for now? (Unless
we can fix it, of course. The alternative is to apply Miao Xie's
workaround patch temporarily.)

If we were going to mark anything as broken, wouldn't cpu-hotplug be
the more appropriate victim? I suspect that there are more systems
using cpusets in production environments than using cpu hotplug. But
as you say, fixing it sounds better.

I'm sorry for the harsh characterization and suggestion; please accept
my apology. It was purely a result of my excitement at having made
some progress in this case.

But I have more good news; reverting this:

commit f18f982abf183e91f435990d337164c7a43d1e6d
Author: Max Krasnyansky <maxk@xxxxxxxxxxxx>
Date: Thu May 29 11:17:01 2008 -0700

sched: CPU hotplug events must not destroy scheduler domains created by the
cpusets

The first issue is not related to the cpusets. We're simply leaking doms_cur.
It's allocated in arch_init_sched_domains(), which is called for every
hotplug event. So we just keep reallocating doms_cur without freeing it.
I introduced a free_sched_domains() function that cleans things up.

The second issue is that sched domains created by the cpusets are
completely destroyed by the CPU hotplug events. For all CPU hotplug
events the scheduler attaches all CPUs to the NULL domain and then puts
them all into a single domain, thereby destroying the domains created
by the cpusets (partition_sched_domains).
The solution is simple: when cpusets are enabled, the scheduler should not
create the default domain and should instead let cpusets do that. Which is
exactly what the patch does.

Signed-off-by: Max Krasnyansky <maxk@xxxxxxxxxxxx>
Cc: pj@xxxxxxx
Cc: menage@xxxxxxxxxx
Cc: rostedt@xxxxxxxxxxx
Acked-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

gets rid of the BUG! (Added people to Ccs.)

Really? Judging by the backtraces in your first email, it seems unrelated.

Might I instead suggest a revert of this? (Again, unless somebody else
can spot the real error and fix it before 2.6.26 is out :-))

I'd actually be ok with reverting it. Paul and I were looking into some circular locking issues triggered by the very same patch. Since we do not have a solution yet, we could revert it for now and work on a fix during the .27-rc series.

Max
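
For readers following the quoted commit message, here is a minimal userspace sketch of the two issues it describes. It is not the kernel code. Only free_sched_domains() and doms_cur are names taken from the message; live_after(), hotplug_alloc(), hotplug_rebuild(), domain_of(), cpu_domain[] and the NR_CPUS value of 4 are hypothetical names and values chosen for illustration, and allocation is modelled with a plain counter rather than real memory.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* assumption: small fixed CPU count for the model */

/* Issue 1 model: number of domain arrays allocated and not yet freed. */
static int live_allocs;
static int have_doms_cur;	/* nonzero while a "doms_cur" array is held */

/* Models the cleanup helper the commit message calls free_sched_domains(). */
static void free_sched_domains(void)
{
	if (have_doms_cur) {
		live_allocs--;
		have_doms_cur = 0;
	}
}

/* One simulated hotplug event; 'fixed' selects free-before-reallocate. */
static void hotplug_alloc(int fixed)
{
	if (fixed)
		free_sched_domains();
	live_allocs++;		/* stands in for allocating a new doms_cur */
	have_doms_cur = 1;
}

/* Returns how many arrays are still allocated after 'events' events. */
static int live_after(int events, int fixed)
{
	live_allocs = 0;
	have_doms_cur = 0;
	for (int i = 0; i < events; i++)
		hotplug_alloc(fixed);
	return live_allocs;
}

/* Issue 2 model: domain id per CPU, -1 stands for the NULL domain. */
static int cpu_domain[NR_CPUS];

/*
 * Simulated hotplug rebuild. A NULL map models the behaviour described as
 * the problem: every CPU ends up in one default domain. A non-NULL map
 * models letting cpusets supply the partition instead.
 */
static void hotplug_rebuild(const int *cpuset_map)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_domain[cpu] = -1;	/* detach every CPU first */
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_domain[cpu] = cpuset_map ? cpuset_map[cpu] : 0;
}

static int domain_of(int cpu)
{
	return cpu_domain[cpu];
}
```

In this model, live_after(n, 0) grows with the number of hotplug events while live_after(n, 1) stays at one, and hotplug_rebuild(NULL) collapses any partition into domain 0. The sketch says nothing about the locking problems or the BUG discussed in this thread.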

