Re: [PATCH] sched/cpuset: distribute tasks within affinity masks

From: Tejun Heo
Date: Wed Mar 04 2020 - 12:11:56 EST


On Thu, Feb 27, 2020 at 05:01:34PM -0800, Josh Don wrote:
> From: Paul Turner <pjt@xxxxxxxxxx>
>
> Currently, when updating the affinity of tasks via either cpuset.cpus
> or sched_setaffinity(), tasks not currently running within the newly
> specified CPU mask will be arbitrarily assigned to the first CPU within
> the mask.
>
> This (particularly in the case that we are restricting masks) can
> result in many tasks being assigned to the first CPUs of their new
> masks.
>
> This:
> 1) Can induce scheduling delays until the load-balancer has a chance to
> spread them between their new CPUs.
> 2) Can antagonize a poor load-balancer behavior where it has a
> difficult time recognizing that a cross-socket imbalance has been
> forced by an affinity mask.
>
> With this change, tasks are distributed ~evenly across the new mask. We
> may intentionally move tasks already running on a CPU within the mask to
> avoid edge cases in which a CPU is already overloaded (or would be
> assigned more tasks than is desired).
>
> We specifically apply this behavior to the following cases:
> - modifying cpuset.cpus
> - when tasks join a cpuset
> - when modifying a task's affinity via sched_setaffinity(2)
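
For anyone following along, the "~evenly across the new mask" behavior
described above can be pictured as a rotating cursor over the allowed
CPUs. What follows is only a userspace sketch under my own assumptions,
not the code from the patch: next_cpu_wrap(), distribute(), and the
64-bit mask are hypothetical names chosen for illustration, and a real
implementation would operate on struct cpumask inside the scheduler.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 64	/* assumption: the mask fits in one 64-bit word */

/*
 * Hypothetical helper: return the first CPU set in @mask that comes
 * after @prev, wrapping around to CPU 0. Pass prev = -1 to start the
 * search at CPU 0. If @prev is the only CPU in the mask, it is
 * returned again.
 */
static int next_cpu_wrap(uint64_t mask, int prev)
{
	assert(mask != 0);	/* an empty mask has no valid target */

	for (int i = 1; i <= NR_CPUS; i++) {
		int cpu = (prev + i) % NR_CPUS;

		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	}
	return -1;		/* not reached for a non-empty mask */
}

/*
 * Place @ntasks tasks on the CPUs in @mask, advancing the cursor after
 * every placement instead of always picking the first CPU. @count
 * records how many tasks land on each CPU.
 */
static void distribute(uint64_t mask, int ntasks, int count[NR_CPUS])
{
	int prev = -1;

	for (int t = 0; t < ntasks; t++) {
		prev = next_cpu_wrap(mask, prev);
		count[prev]++;
	}
}
```

With a mask of CPUs 2, 3, 6 and 7 and eight tasks, this places two
tasks on each CPU, whereas always taking the first CPU in the mask
would stack all eight on CPU 2.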

Looks fine to me. Peter, what do you think?

Thanks.

--
tejun