Re: [PATCH v2 18/19] sched/numa: Reset scan rate whenever task moves across nodes

From: Mel Gorman
Date: Thu Jun 21 2018 - 06:05:12 EST


On Wed, Jun 20, 2018 at 10:32:59PM +0530, Srikar Dronamraju wrote:
> @@ -6668,6 +6662,19 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu __maybe_unus
>
> /* We have migrated, no longer consider this task hot */
> p->se.exec_start = 0;
> +
> +#ifdef CONFIG_NUMA_BALANCING
> + if (!p->mm || (p->flags & PF_EXITING))
> + return;
> +
> + if (p->numa_faults) {
> + int src_nid = cpu_to_node(task_cpu(p));
> + int dst_nid = cpu_to_node(new_cpu);
> +
> + if (src_nid != dst_nid)
> + p->numa_scan_period = task_scan_start(p);
> + }
> +#endif
> }
>

We talked about this before, but I would at least suggest that you not
reset the scan period when moving to the preferred node, or when the node
movement has nothing to do with the preferred nid, e.g.

	/*
	 * Ignore if the migration is not changing node, if it is migrating to
	 * the preferred node or moving between two nodes that are not preferred
	 */
	if (p->numa_faults) {
		int src_nid = cpu_to_node(task_cpu(p));
		int dst_nid = cpu_to_node(new_cpu);

		if (src_nid == dst_nid || dst_nid == p->numa_preferred_nid ||
		    (p->numa_preferred_nid != -1 && src_nid != p->numa_preferred_nid))
			return;

		p->numa_scan_period = task_scan_start(p);
	}
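
To spell out which migrations that condition lets through, here is a
minimal userspace sketch of the same predicate. It is not kernel code:
should_reset_scan_period() and NO_PREFERRED_NID are hypothetical names
used only for illustration, and the node ids are plain integers rather
than values taken from cpu_to_node().

```c
#include <stdbool.h>

/* Same convention as the snippet above: -1 means no preferred node is set. */
#define NO_PREFERRED_NID (-1)

/*
 * Hypothetical helper, not part of the patch: returns true when the
 * suggested condition would fall through to the scan period reset.
 */
static bool should_reset_scan_period(int src_nid, int dst_nid,
				     int preferred_nid)
{
	/* The migration is not changing node. */
	if (src_nid == dst_nid)
		return false;

	/* The task is migrating to its preferred node. */
	if (dst_nid == preferred_nid)
		return false;

	/* A preferred node is set, but the task is not leaving it. */
	if (preferred_nid != NO_PREFERRED_NID && src_nid != preferred_nid)
		return false;

	/* Either leaving the preferred node, or no preferred node is set. */
	return true;
}
```

With this reading, the reset only happens when the task leaves its
preferred node, or when it changes node before a preferred node has
been chosen.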

Note too that the next scan can be an arbitrary amount of time in the
future. As an alternative, consider scheduling an immediate scan instead
of adjusting the rate, with

p->mm->numa_next_scan = jiffies;

That might be less harmful in terms of overhead while still collecting
some data in the short-term.
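
As a rough illustration of the difference, the toy model below compares
the two options. It is a userspace sketch under stated assumptions, not
the scheduler code: struct toy_task and both helpers are made-up names,
time is a plain tick counter with no wraparound handling, and the gate
is assumed to be "scan once now has reached next_scan, then schedule
the following scan one period later".

```c
#include <stdbool.h>

/* Toy stand-ins for the kernel fields; all names here are illustrative. */
struct toy_task {
	unsigned long next_scan;   /* models p->mm->numa_next_scan, in ticks */
	unsigned long scan_period; /* models p->numa_scan_period, in ticks */
};

/*
 * Assumed gate: a scan runs only once "now" has reached next_scan, and
 * the following scan is then scheduled one scan_period later.
 */
static bool toy_scan_due(struct toy_task *t, unsigned long now)
{
	if (now < t->next_scan)
		return false;

	t->next_scan = now + t->scan_period;
	return true;
}

/* Returns the first tick at or after "now" at which a scan runs. */
static unsigned long toy_first_scan_tick(struct toy_task t, unsigned long now)
{
	while (!toy_scan_due(&t, now))
		now++;

	return now;
}
```

In this model, a task with next_scan = 1000 that migrates at tick 10
still waits until tick 1000 if only scan_period is shortened, whereas
setting next_scan to the current tick makes the scan run at tick 10 and
leaves the period untouched.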

--
Mel Gorman
SUSE Labs