Re: [PATCH 06/15] sched: Reschedule task on preferred NUMA node once selected

From: Mel Gorman
Date: Mon Jul 08 2013 - 04:35:36 EST


On Sat, Jul 06, 2013 at 12:38:13PM +0200, Peter Zijlstra wrote:
> On Sat, Jul 06, 2013 at 12:08:53AM +0100, Mel Gorman wrote:
> > +static int
> > +find_idlest_cpu_node(int this_cpu, int nid)
> > +{
> > +	unsigned long load, min_load = ULONG_MAX;
> > +	int i, idlest_cpu = this_cpu;
> > +
> > +	BUG_ON(cpu_to_node(this_cpu) == nid);
> > +
> > +	rcu_read_lock();
> > +	for_each_cpu(i, cpumask_of_node(nid)) {
> > +		load = weighted_cpuload(i);
> > +
> > +		if (load < min_load) {
> > +			/*
> > +			 * Kernel threads can be preempted. For others, do
> > +			 * not preempt if running on their preferred node
> > +			 * or pinned.
> > +			 */
> > +			struct task_struct *p = cpu_rq(i)->curr;
> > +			if ((p->flags & PF_KTHREAD) ||
> > +			    (p->numa_preferred_nid != nid && p->nr_cpus_allowed > 1)) {
> > +				min_load = load;
> > +				idlest_cpu = i;
> > +			}
>
> So I really don't get this stuff.. if it is indeed the idlest cpu, preempting
> others shouldn't matter. Also, migrating a task there doesn't actually mean it
> will get preempted either.
>

At one point this was part of a patch that swapped tasks on the target
node, where it really was preempting the running task as the comment
describes. Swapping was premature because it did not evaluate whether the
swap would improve performance overall. You're right, this check should
be removed entirely, and it will be gone in the next update.
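
With the check dropped, the selection reduces to a plain minimum-load scan
over the CPUs of the target node. The following is only a user-space sketch
of that idea, not the follow-up patch: the function name, the node_cpus[]
array (standing in for cpumask_of_node(nid)) and the cpu_load[] array
(standing in for weighted_cpuload()) are all hypothetical, and the RCU
locking of the original is omitted.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * User-space model of the simplified selection loop.
 *
 * node_cpus[] lists the CPU ids on the target node (hypothetical stand-in
 * for cpumask_of_node(nid)); cpu_load[] is indexed by CPU id (hypothetical
 * stand-in for weighted_cpuload()).
 *
 * Returns the least loaded CPU on the node, or this_cpu if the node has
 * no CPUs. On equal load the first CPU scanned wins.
 */
static int find_idlest_cpu_node_sketch(int this_cpu,
				       const int *node_cpus,
				       size_t nr_node_cpus,
				       const unsigned long *cpu_load)
{
	unsigned long min_load = ULONG_MAX;
	int idlest_cpu = this_cpu;
	size_t k;

	for (k = 0; k < nr_node_cpus; k++) {
		int cpu = node_cpus[k];
		unsigned long load = cpu_load[cpu];

		/* Models BUG_ON(cpu_to_node(this_cpu) == nid). */
		assert(cpu != this_cpu);

		/* No inspection of the running task: load alone decides. */
		if (load < min_load) {
			min_load = load;
			idlest_cpu = cpu;
		}
	}

	return idlest_cpu;
}
```

Whether the task currently running on the chosen CPU is a kernel thread,
pinned, or already on its preferred node no longer influences the choice.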

Thanks.

--
Mel Gorman
SUSE Labs