Re: [RFC] bitmap onto and fold operators for mempolicy extensions

From: David Rientjes
Date: Mon Feb 18 2008 - 19:47:00 EST


On Sat, 16 Feb 2008, Paul Jackson wrote:

> Let's say an application has specified some mempolicies
> that presume 16 memory nodes, including say a mempolicy that
> specified MPOL_F_RELATIVE_NODES (cpuset relative) nodes 12-15.
> Then lets say that application is crammed into a cpuset that only
> has 8 memory nodes, 0-7. If one just uses bitmap_onto(), this
> mempolicy, mapped to that cpuset, would ignore the requested
> relative nodes above 7, leaving it empty of nodes. That's not
> good; better to fold the higher nodes down, so that some nodes
> are included in the resulting mapped mempolicy. In this case,
> the mempolicy nodes 12-15 are taken modulo 8 (the weight of the
> mems_allowed of the confining cpuset), resulting in a mempolicy
> specifying nodes 4-7.
>

So what is the MPOL_F_RELATIVE_NODES behavior? Is it a combination of
nodes_onto() and nodes_fold()?

In your example, the only way we know to use nodes_fold() is if the
resultant of nodes_onto() has a weight of 0. An MPOL_F_RELATIVE_NODES
nodemask for 4-6, for example, works fine in your case of a cpuset with
memory nodes 0-7 and no fold is required.

So it's easy enough to do this:

case MPOL_INTERLEAVE:
if (flags & MPOL_F_RELATIVE_NODES) {
nodes_onto(pol->v.nodes, pol->user_nodemask,
cpuset_context_nmask);
if (nodes_empty(pol->v.nodes))
nodes_fold(pol->v.nodes,
pol->user_nodemask,
nodes_weight(cpuset_context_nmask));
} else {
...
}
break;

But what if we require a combination? Say the user asked for a policy of
MPOL_INTERLEAVE | MPOL_F_RELATIVE_NODES over nodes 4-8 in a cpuset
constrained to mems 0-7? Should the resultant be 0,4-7 (combination of
nodes_onto() and nodes_fold()) or simply be 4-7 (just nodes_onto())?

And what if the MPOL_INTERLEAVE | MPOL_F_RELATIVE_NODES nodemask is 0,4-8
in the same cpuset constrained to mems 0-7? Should the resultant be

- 0,4-7 (nodes_onto() and nodes_fold()),

- 0,4-7 (just nodes_onto()), or

- 0-1,4-7 (nodes_onto(), nodes_fold(), and shift)?

The last option, 0-1,4-7, is the only one that preserves the same weight
as the relative nodemask.

David
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/