Re: [PATCH] mm/mempolicy: Fix an incorrect rebind node in mpol_rebind_nodemask

From: Vlastimil Babka
Date: Thu Jun 27 2019 - 06:03:34 EST


On 5/25/19 9:07 AM, zhong jiang wrote:
> We bind different nodes to different VMAs. Unfortunately, checking
> /proc/pid/numa_maps shows that different VMAs can end up bound to the same node.
> Commit 213980c0f23b ("mm, mempolicy: simplify rebinding mempolicies when updating cpusets")
> introduced the issue. When we change the memory policy by setting cpuset.mems,
> a process rebinds the specified policy more than once.
> If cpuset_mems_allowed is not equal to the user-specified nodes, the issue triggers.
> This may result in running out of memory, because memory is allocated from the same node.

OK, how about this instead?

mpol_rebind_nodemask() is called for MPOL_BIND and MPOL_INTERLEAVE
mempolicies when the task's cpuset's mems_allowed changes. For
policies created without MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES,
it works by remapping the policy's allowed nodes (stored in v.nodes)
using the previous value of mems_allowed (stored in
w.cpuset_mems_allowed) as the domain of the map and the new mems_allowed
(passed as nodes) as the range of the map (see the comment of
bitmap_remap() for details).

The result of remapping is stored back as the policy's nodemask in v.nodes,
and the new value of mems_allowed should be stored in
w.cpuset_mems_allowed to facilitate the next rebind, if it happens.

However, commit 213980c0f23b ("mm, mempolicy: simplify rebinding
mempolicies when updating cpusets") introduced a bug where the result of
remapping is stored in w.cpuset_mems_allowed instead. Thus, a
mempolicy's allowed nodes can evolve in an unexpected way after a series
of rebinds due to cpuset mems_allowed changes, possibly binding to the
wrong node or to a smaller number of nodes, which may e.g. overload them.
This patch fixes the bug so rebinding again works as intended.

> Fixes: 213980c0f23b ("mm, mempolicy: simplify rebinding mempolicies when updating cpusets")
> Signed-off-by: zhong jiang <zhongjiang@xxxxxxxxxx>

(an example of the exact sequence of set_mempolicy and cpuset mems
changes, with expected vs. actual results, would be nice, but I think
the above should be fine by itself)
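
For illustration only, here is a minimal user-space sketch of one such
sequence. It is not kernel code and not necessarily the sequence the
reporter used: the helper names (sim_remap, sim_rebind, sim_two_rebinds)
and the chosen masks are hypothetical, nodemasks are reduced to a
uint64_t, and the remap follows the semantics described in the
bitmap_remap() comment as I understand them. A policy is created with
mems_allowed = {0,1}, then mems_allowed changes to {2,3} and back to {0,1}.

```c
#include <stdint.h>

/* Number of set bits in m. */
static int sim_weight(uint64_t m)
{
	int w = 0;

	for (; m; m &= m - 1)
		w++;
	return w;
}

/* Ordinal of bit 'pos' among the set bits of m, or -1 if it is not set. */
static int sim_pos_to_ord(uint64_t m, int pos)
{
	if (!((m >> pos) & 1))
		return -1;
	return sim_weight(m & ((UINT64_C(1) << pos) - 1));
}

/* Position of the ord-th (0-based) set bit of m; needs ord < weight(m). */
static int sim_ord_to_pos(uint64_t m, int ord)
{
	int pos;

	for (pos = 0; pos < 64; pos++)
		if (((m >> pos) & 1) && ord-- == 0)
			return pos;
	return -1;
}

/* Map each bit of src from its ordinal in 'old' to that ordinal in 'newm'. */
static uint64_t sim_remap(uint64_t src, uint64_t old, uint64_t newm)
{
	uint64_t dst = 0;
	int w = sim_weight(newm);
	int bit;

	for (bit = 0; bit < 64; bit++) {
		int n;

		if (!((src >> bit) & 1))
			continue;
		n = sim_pos_to_ord(old, bit);
		if (n < 0 || w == 0)
			dst |= UINT64_C(1) << bit;	/* identity map */
		else
			dst |= UINT64_C(1) << sim_ord_to_pos(newm, n % w);
	}
	return dst;
}

/* Stand-in for the two mempolicy fields discussed above. */
struct sim_policy {
	uint64_t nodes;			/* plays the role of v.nodes */
	uint64_t cpuset_mems_allowed;	/* plays the role of w.cpuset_mems_allowed */
};

/* Remap branch of the rebind; 'buggy' selects the pre-patch assignment. */
static void sim_rebind(struct sim_policy *pol, uint64_t newmems, int buggy)
{
	uint64_t tmp = sim_remap(pol->nodes, pol->cpuset_mems_allowed, newmems);

	pol->cpuset_mems_allowed = buggy ? tmp : newmems;
	if (!tmp)
		tmp = newmems;
	pol->nodes = tmp;
}

/* Policy created under mems {0,1}; mems then goes to {2,3} and back. */
static uint64_t sim_two_rebinds(uint64_t nodes, int buggy)
{
	struct sim_policy pol = { nodes, 0x3 };

	sim_rebind(&pol, 0xc, buggy);
	sim_rebind(&pol, 0x3, buggy);
	return pol.nodes;
}
```

With the patched assignment, a policy bound to node 0 ends up on node 0
again and one bound to node 1 ends up on node 1 again. With the pre-patch
assignment, both end up on node 0, which would match the "different vma
to same node" symptom in the report.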

Reviewed-by: Vlastimil Babka <vbabka@xxxxxxx>

> ---
> mm/mempolicy.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index e3ab1d9..a60a3be 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -345,7 +345,7 @@ static void mpol_rebind_nodemask(struct mempolicy *pol, const nodemask_t *nodes)
>  	else {
>  		nodes_remap(tmp, pol->v.nodes,pol->w.cpuset_mems_allowed,
>  								*nodes);
> -		pol->w.cpuset_mems_allowed = tmp;
> +		pol->w.cpuset_mems_allowed = *nodes;
>  	}
>
>  	if (nodes_empty(tmp))
>