Re: [PATCH 06/15] sched: Reschedule task on preferred NUMA node once selected

From: Peter Zijlstra
Date: Sat Jul 06 2013 - 06:39:34 EST


On Sat, Jul 06, 2013 at 12:08:53AM +0100, Mel Gorman wrote:
> +static int
> +find_idlest_cpu_node(int this_cpu, int nid)
> +{
> +	unsigned long load, min_load = ULONG_MAX;
> +	int i, idlest_cpu = this_cpu;
> +
> +	BUG_ON(cpu_to_node(this_cpu) == nid);
> +
> +	rcu_read_lock();
> +	for_each_cpu(i, cpumask_of_node(nid)) {
> +		load = weighted_cpuload(i);
> +
> +		if (load < min_load) {
> +			/*
> +			 * Kernel threads can be preempted. For others, do
> +			 * not preempt if running on their preferred node
> +			 * or pinned.
> +			 */
> +			struct task_struct *p = cpu_rq(i)->curr;
> +			if ((p->flags & PF_KTHREAD) ||
> +			    (p->numa_preferred_nid != nid && p->nr_cpus_allowed > 1)) {
> +				min_load = load;
> +				idlest_cpu = i;
> +			}

So I really don't get this stuff. If it is indeed the idlest cpu, preempting
others shouldn't matter. Also, migrating a task there doesn't actually mean
the task currently running there will get preempted either.

In overloaded scenarios it is expected that multiple tasks will run on the
same cpu, so this condition will also explicitly make overloaded scenarios
work less well.

> +		}
> +	}
> +	rcu_read_unlock();
> +
> +	return idlest_cpu;
> +}
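
To make the objection concrete, here is a rough userspace sketch, not kernel
code, that puts the quoted selection next to a plain load-only pick. Everything
in it is an assumption made for illustration: struct fake_cpu and its fields
stand in for weighted_cpuload() and the cpu_rq(i)->curr checks, and
pick_filtered() / pick_idlest() are hypothetical names. The fallback argument
plays the role of this_cpu.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-cpu state the quoted loop looks at. */
struct fake_cpu {
	unsigned long load;	/* stands in for weighted_cpuload(i) */
	bool curr_is_kthread;	/* stands in for curr->flags & PF_KTHREAD */
	int curr_preferred_nid;	/* stands in for curr->numa_preferred_nid */
	int curr_nr_allowed;	/* stands in for curr->nr_cpus_allowed */
};

/*
 * Selection with the condition from the quoted patch: a cpu only counts
 * if its current task is a kernel thread, or is neither on its preferred
 * node nor pinned.
 */
static int pick_filtered(const struct fake_cpu *cpus, int n, int nid,
			 int fallback)
{
	unsigned long min_load = ULONG_MAX;
	int i, best = fallback;

	for (i = 0; i < n; i++) {
		const struct fake_cpu *c = &cpus[i];

		if (c->load >= min_load)
			continue;

		if (c->curr_is_kthread ||
		    (c->curr_preferred_nid != nid && c->curr_nr_allowed > 1)) {
			min_load = c->load;
			best = i;
		}
	}

	return best;
}

/* Plain selection: load is the only criterion. */
static int pick_idlest(const struct fake_cpu *cpus, int n, int fallback)
{
	unsigned long min_load = ULONG_MAX;
	int i, best = fallback;

	for (i = 0; i < n; i++) {
		if (cpus[i].load < min_load) {
			min_load = cpus[i].load;
			best = i;
		}
	}

	return best;
}
```

Take a node where the least loaded cpu is currently running a task that
already sits on its preferred node. pick_filtered() passes over that cpu and
returns a busier one, while pick_idlest() returns it. If every cpu on the node
is running such a task, pick_filtered() hands back the fallback, so nothing
on the target node is chosen at all.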