Re: [regression] 3.0-rc boot failure -- bisected to cd4ea6ae3982

From: Anton Blanchard
Date: Wed Jul 20 2011 - 06:14:43 EST



Hi Peter,

> That looks very strange indeed.. up to node 23 there is the normal
> symmetric matrix with all the trace elements on 10 (as we would expect
> for local access), and some 4x4 sub-matrix stacked around the trace
> with 20, suggesting a single hop distance, and the rest on 40 being
> out-there.

I retested with the latest version of numactl and now get correct results.

I worked out why the patches don't boot: we weren't allocating any
space for the cpumask, so we ran off the end of the allocation.
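
To illustrate the failure mode outside the kernel, here is a minimal
userspace sketch. It assumes the mask is stored directly behind the
struct, as a trailing array. Everything named model_* and
MODEL_NR_CPUS is hypothetical and only stands in for the kernel types;
calloc() plays the role of kzalloc_node(). The point is that the
allocation has to cover the header plus the mask, otherwise every write
to the mask lands past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
#define MODEL_NR_CPUS 64
#define MODEL_BITS_PER_LONG (8 * sizeof(unsigned long))

struct model_group {
	int id;
	unsigned long cpumask[];	/* storage lives past the end of the struct */
};

/* Bytes needed to hold MODEL_NR_CPUS bits, rounded up to whole longs. */
static size_t model_cpumask_size(void)
{
	return (MODEL_NR_CPUS + MODEL_BITS_PER_LONG - 1) /
		MODEL_BITS_PER_LONG * sizeof(unsigned long);
}

/*
 * Allocate the header *plus* the trailing mask. Passing only
 * sizeof(struct model_group) here would leave cpumask[] with no
 * backing storage at all.
 */
static struct model_group *model_group_alloc(void)
{
	return calloc(1, sizeof(struct model_group) + model_cpumask_size());
}

static void model_set_cpu(struct model_group *g, unsigned int cpu)
{
	g->cpumask[cpu / MODEL_BITS_PER_LONG] |= 1UL << (cpu % MODEL_BITS_PER_LONG);
}

static int model_test_cpu(const struct model_group *g, unsigned int cpu)
{
	return (g->cpumask[cpu / MODEL_BITS_PER_LONG] >>
		(cpu % MODEL_BITS_PER_LONG)) & 1UL;
}

/* Small self-check: set one bit and confirm its neighbour stays clear. */
static void model_selftest(void)
{
	struct model_group *g = model_group_alloc();

	assert(g != NULL);
	model_set_cpu(g, 5);
	assert(model_test_cpu(g, 5) == 1);
	assert(model_test_cpu(g, 6) == 0);
	free(g);
}
```

Calling model_selftest() from a main() exercises the allocation and the
bit helpers. Dropping the "+ model_cpumask_size()" term reproduces the
overrun, which a tool such as valgrind or AddressSanitizer should flag.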

Should we also use cpumask_copy instead of open coding it? I added that
too.

Anton

Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c 2011-07-20 01:54:08.191668781 -0500
+++ linux-2.6/kernel/sched.c 2011-07-20 04:45:36.203750525 -0500
@@ -7020,8 +7020,8 @@
 		if (cpumask_test_cpu(i, covered))
 			continue;

-		sg = kzalloc_node(sizeof(struct sched_group), GFP_KERNEL,
-				cpu_to_node(i));
+		sg = kzalloc_node(sizeof(struct sched_group) + cpumask_size(),
+				GFP_KERNEL, cpu_to_node(i));

 		if (!sg)
 			goto fail;
@@ -7031,7 +7031,7 @@
 		child = *per_cpu_ptr(sdd->sd, i);
 		if (child->child) {
 			child = child->child;
-			*sg_span = *sched_domain_span(child);
+			cpumask_copy(sg_span, sched_domain_span(child));
 		} else
 			cpumask_set_cpu(i, sg_span);


