Re: [PATCH] sched_groups are expected to be circular linked list, make it so right after allocation

From: Peter Zijlstra
Date: Thu May 10 2012 - 09:45:57 EST


On Thu, 2012-05-10 at 15:26 +0200, Igor Mammedov wrote:
> [ 141.699854] sched: Bonkers domain doesn't include its own cpu: 3 0-1,3
> [ 141.725038] sched: Bonkers domain doesn't include its own cpu: 3 0-1

Whee!! So cpu_mask (active_mask) does include 3, but the tl->mask()
doesn't.
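
To make the failure mode concrete, here is a minimal userspace sketch of the
check behind the "Bonkers domain" message. It is not the kernel code: the
toy_* names and the 64-bit mask type are hypothetical stand-ins for struct
cpumask, and it assumes the domain span is simply the intersection of the cpu
map with the topology level's mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t toy_cpumask_t;

/* Build a mask covering CPUs first..last inclusive, e.g. (0, 1) for "0-1". */
static toy_cpumask_t toy_mask_of_range(unsigned first, unsigned last)
{
    toy_cpumask_t m = 0;

    assert(first <= last && last < 64);
    for (unsigned cpu = first; cpu <= last; cpu++)
        m |= (toy_cpumask_t)1 << cpu;
    return m;
}

/* True if @cpu is set in @mask. */
static bool toy_test_cpu(unsigned cpu, toy_cpumask_t mask)
{
    assert(cpu < 64);
    return (mask >> cpu) & 1;
}

/* Assumed model of the span: cpu map restricted to the topology level's mask. */
static toy_cpumask_t toy_domain_span(toy_cpumask_t cpu_map, toy_cpumask_t tl_mask)
{
    return cpu_map & tl_mask;
}

/* A domain is "bonkers" when the CPU it is built for is not in its own span. */
static bool toy_domain_is_bonkers(unsigned cpu, toy_cpumask_t cpu_map,
                                  toy_cpumask_t tl_mask)
{
    return !toy_test_cpu(cpu, toy_domain_span(cpu_map, tl_mask));
}
```

With a cpu map of 0-1,3 and a node mask of 0-1, toy_domain_is_bonkers(3, ...)
returns true, which matches the situation in the log: the active mask has
CPU 3, the topology mask does not.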

> [ 141.775040] sched: Topology is hosed for CPU-3!!
> [ 141.775596] sched: domain: NODE 0-1
> [ 141.776004] sched: group: 0-1
>
This seems to suggest it's the node topology being wrecked, which with
your code-base would be cpu_node_mask()->sched_domain_node_span().

Did you specify any node topology on the qemu command line? If not, it
should all reduce to cpumask_of_node(0).

identify_secondary_cpu()->identify_cpu()->numa_add_cpu() should set that
bit, which is well before the CPU_ONLINE->cpuset_update_active_cpus()
sched domain rebuild.
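
The expected ordering can be modelled in a few lines of userspace C. This is
only a sketch under stated assumptions: struct toy_topology and the toy_*
helpers are hypothetical, and the real numa_add_cpu() and the sched domain
rebuild do considerably more than set and test one bit.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_NODES 4

/* Hypothetical model: per-node CPU masks plus the active mask, one bit per CPU.
 * Callers must keep cpu < 64 and node < TOY_MAX_NODES. */
struct toy_topology {
    uint64_t node_to_cpumask[TOY_MAX_NODES];
    uint64_t active_mask;
};

/* Models the step the mail attributes to numa_add_cpu(): record @cpu in @node. */
static void toy_numa_add_cpu(struct toy_topology *t, unsigned cpu, unsigned node)
{
    t->node_to_cpumask[node] |= (uint64_t)1 << cpu;
}

/* Models the CPU becoming active, which happens later in bring-up. */
static void toy_set_cpu_active(struct toy_topology *t, unsigned cpu)
{
    t->active_mask |= (uint64_t)1 << cpu;
}

/* Models the rebuild-time check: is @cpu inside the node span it is built from? */
static bool toy_rebuild_span_has_cpu(const struct toy_topology *t,
                                     unsigned cpu, unsigned node)
{
    uint64_t span = t->active_mask & t->node_to_cpumask[node];

    return (span >> cpu) & 1;
}
```

If toy_set_cpu_active() runs for CPU 3 without a prior toy_numa_add_cpu(), the
rebuild check fails for that CPU. That is the state the log shows, so the
question is why the node mask bit is missing by the time the rebuild runs.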


Most puzzling. Can you dig a little deeper as to why these masks might
be wrong? Also, can you reproduce on actual hardware? The reason I never
use kvm or other virt for debugging is that I always end up spending
time chasing virt bugs, and I hate virt.