Re: [PATCH] sched_groups are expected to be circular linked list, make it so right after allocation

From: Igor Mammedov
Date: Wed May 09 2012 - 07:59:03 EST


On 05/09/2012 01:52 PM, Peter Zijlstra wrote:
> On Wed, 2012-05-09 at 13:44 +0200, Igor Mammedov wrote:
>> This patch fixes only build_sched_groups path, but there is another fail path
>> that results in below OOPS.
>> build_overlap_sched_groups() may exit without setting groups and later it will crash
>> init_sched_groups_power as well.
>
> if that allocation fails? Or is there another fail path?

build_overlap_sched_groups(struct sched_domain *sd, int cpu)
	...
		if (cpumask_test_cpu(cpu, sg_span))
			groups = sg;
	...

The test above fails and leaves the local variable groups set to NULL,
and just before the function returns there is:

	sd->groups = groups;

which resets sd->groups to NULL. I'm not sure whether it would be correct
at all to simply skip this assignment when groups == NULL.
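
To make the path concrete, here is a rough user-space model of it. This is
only a sketch, not the kernel code: the structures are cut down to the minimum,
a span is a plain bitmask instead of a cpumask, and model_build_groups() is a
hypothetical name that stands in for the part of build_overlap_sched_groups()
quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; a span is a CPU bitmask. */
struct sched_group {
	unsigned long span;
};

struct sched_domain {
	struct sched_group *groups;
};

/*
 * Models only the part of build_overlap_sched_groups() quoted above:
 * remember a group whose span contains cpu, then store the result
 * unconditionally.
 */
static void model_build_groups(struct sched_domain *sd, int cpu,
			       struct sched_group *sgs, size_t nr)
{
	struct sched_group *groups = NULL;
	size_t i;

	assert(sd != NULL);

	for (i = 0; i < nr; i++) {
		/* stands in for cpumask_test_cpu(cpu, sg_span) */
		if (sgs[i].span & (1UL << cpu))
			groups = &sgs[i];
	}

	/* If the test never succeeded, this overwrites sd->groups with NULL. */
	sd->groups = groups;
}
```

If cpu is not in any of the group spans, sd->groups ends up NULL even if it
pointed to something before the call, and whoever walks sd->groups next
dereferences NULL.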


>> But I just don't know how to fix it, so I've just
>> posted a partial fix that reduces crash frequency.
>>
>> And I have to admit that
>> cpu_active_mask and siblings map are busted but we either should not exit from builder
>> funcs with NULL group or BUG there if it is impossible to come up with a sane group
>> for an insane domain span.
>
> I'm perfectly OK with taking the machine down, provided we can output
> useful messages as to what is broken first.
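
Something along these lines is what I would imagine for "report first, then
stop". Again this is only a user-space sketch under my own assumptions:
model_check_groups() and the message text are made up for illustration, and in
the kernel the caller would presumably use printk() and BUG() rather than a
buffer and a return code.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; a span is a CPU bitmask. */
struct sched_group {
	unsigned long span;
};

struct sched_domain {
	unsigned long span;
	struct sched_group *groups;
};

/*
 * Hypothetical check: returns 0 if the domain has a group. Otherwise it
 * writes a diagnostic into buf and returns -1, so the caller can print
 * what is broken before it takes the machine down.
 */
static int model_check_groups(const struct sched_domain *sd, int cpu,
			      char *buf, size_t len)
{
	if (sd->groups)
		return 0;

	snprintf(buf, len, "CPU%d: no sched group for domain span 0x%lx",
		 cpu, sd->span);
	return -1;
}
```

The point is only the ordering: the broken span gets reported before anything
dereferences the NULL group.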

--
-----
Igor