Re: [PATCH v6 1/6] mm/mempolicy: Add MPOL_PREFERRED_MANY for multiple preferred nodes

From: Feng Tang
Date: Mon Aug 02 2021 - 07:33:38 EST


On Mon, Aug 02, 2021 at 01:14:29PM +0200, Michal Hocko wrote:
> On Mon 02-08-21 16:11:30, Feng Tang wrote:
> > On Fri, Jul 30, 2021 at 03:18:40PM +0800, Tang, Feng wrote:
> > [snip]
> > > > > One thing is, it's possible that 'nd' is not set in the preferred
> > > > > nodemask.
> > > >
> > > > Yes, and there shouldn't be any problem with that. The given node is
> > > > only used to get the respective zonelist (distance-ordered list of
> > > > zones to try). get_page_from_freelist will then use the preferred node
> > > > mask to filter this zone list. Is that more clear now?
> > >
> > > Yes, from the code, the policy_node() is always coupled with
> > > policy_nodemask(), which secures the 'nodemask' limit. Thanks for
> > > the clarification!
> >
> > Hi Michal,
> >
> > To ensure the nodemask limit, the policy_nodemask() also needs some
> > change to return the nodemask for 'prefer-many' policy, so here is an
> > updated 1/6 patch, which mainly changes the node/nodemask selection
> > for 'prefer-many' policy. Could you review it? Thanks!
>
> right, I have mixed it with get_policy_nodemask
>
> > @@ -1875,8 +1897,13 @@ static int apply_policy_zone(struct mempolicy *policy, enum zone_type zone)
> >   */
> >  nodemask_t *policy_nodemask(gfp_t gfp, struct mempolicy *policy)
> >  {
> > -	/* Lower zones don't get a nodemask applied for MPOL_BIND */
> > -	if (unlikely(policy->mode == MPOL_BIND) &&
> > +	int mode = policy->mode;
> > +
> > +	/*
> > +	 * Lower zones don't get a nodemask applied for 'bind' and
> > +	 * 'prefer-many' policies
> > +	 */
> > +	if (unlikely(mode == MPOL_BIND || mode == MPOL_PREFERRED_MANY) &&
> >  			apply_policy_zone(policy, gfp_zone(gfp)) &&
> >  			cpuset_nodemask_valid_mems_allowed(&policy->nodes))
> >  		return &policy->nodes;
>
> Isn't this just too cryptic? Why didn't you simply
> 	if (mode == MPOL_PREFERRED_MANY)
> 		return &policy->nodes;
>
> in addition to the existing code? I mean why would you even care about
> cpusets? Those are handled at the page allocator layer and will further
> filter the given nodemask.

Ok, I will follow your suggestion and keep 'bind' handling unchanged.
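
To check that I read the suggestion correctly, below is a userspace toy
model of the resulting selection logic. It is only a sketch, not the
kernel code: nodemask_t is reduced to an unsigned long, the function is
renamed policy_nodemask_model(), and the hypothetical 'bind_applies' flag
stands in for the apply_policy_zone() and
cpuset_nodemask_valid_mems_allowed() checks of the existing 'bind' branch.

```c
/* Userspace toy model, NOT kernel code: all types are simplified stand-ins. */
#include <stdbool.h>
#include <stddef.h>

enum mpol_mode { MPOL_DEFAULT, MPOL_PREFERRED, MPOL_BIND, MPOL_PREFERRED_MANY };

/* Assumption: one bit per node, instead of the kernel's bitmap type. */
typedef unsigned long nodemask_t;

struct mempolicy {
	enum mpol_mode mode;
	nodemask_t nodes;
};

/*
 * 'bind_applies' is a hypothetical flag replacing the zone and cpuset
 * checks that guard the 'bind' branch in the real function.
 */
nodemask_t *policy_nodemask_model(struct mempolicy *policy, bool bind_applies)
{
	/* Existing 'bind' handling, kept unchanged. */
	if (policy->mode == MPOL_BIND && bind_applies)
		return &policy->nodes;

	/* 'prefer-many' hands its nodemask to the allocator unconditionally. */
	if (policy->mode == MPOL_PREFERRED_MANY)
		return &policy->nodes;

	/* Every other case: no nodemask restriction from the policy. */
	return NULL;
}
```

In this model a 'prefer-many' policy returns its nodemask whatever the
flag says, while 'bind' still returns NULL when the guard does not hold.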

And to be honest, I don't fully understand the current handling of the
'bind' policy: won't returning NULL for a 'bind' policy open a way
around the strict 'bind' limit?

Thanks,
Feng


> --
> Michal Hocko
> SUSE Labs