Re: [PATCH mmotm] mempolicy: mbind_range() set_policy() after vma_merge()

From: Hugh Dickins
Date: Fri Mar 04 2022 - 17:33:45 EST


On Fri, 4 Mar 2022, Oleg Nesterov wrote:
> On 03/03, Hugh Dickins wrote:
> >
> > Just delete that optimization now (though it could be made conditional
> > on vma not having a set_policy). Also remove the "next" variable:
> > it turned out to be blameless, but also pointless.
> >
> > Fixes: 3964acd0dbec ("mm: mempolicy: fix mbind_range() && vma_adjust() interaction")
>
> I can't believe I ever looked at this code ;)

;)

>
> Acked-by: Oleg Nesterov <oleg@xxxxxxxxxx>

Thanks.

>
> offtopic question... can't vma_replace_policy() check vm_ops && vm_ops->set_policy
> at the start, before mpol_dup() ?

You are probably correct, though that is not how I did or would do it.

For me too, it's a long time since I've been in here: I have a large
number of cleanups to mm/mempolicy.c, done in the early days of shmem
huge pages (and leading up to NUMA migration outside of mmap lock),
but never the time to carry them forward; and quite a few other people
have made their own cleanups here too, so hard work to integrate it all.

This particular patch here stood out in my mind as an actual bugfix:
our people found shmem pages being allocated on the wrong node because
of it; last time I tried to work in this area, I did pull this one out
and push it into our tree, before approaching any of the rest of it.

If I look at the last of our trees which had the full set of cleanups,
I see that my vma_replace_policy() had less in, but mbind_range() doing
what you have in mind, checking vm_ops->set_policy in between the
mpol_equal() check at the top of the loop and the vma_merge() below it:
with a comment "If set_policy() is implemented, the vma is nothing more
than a window on to the shared object, and we do not set vm_policy and
we do not split the vma" (then calls the set_policy()).

But I think that depends on other changes in other patches (not
setting vm_policy): now, just as usual, I absolutely do not have
the time to get back into that. So, thanks for the observation,
but I have to stick with just this bugfix for now (a long now!).

It's good to hear from you, Oleg: stay well.

Hugh