Re: [PATCH] mm, mempolicy: fix up gup usage in lookup_node

From: Michal Hocko
Date: Tue Apr 21 2020 - 10:46:10 EST


On Tue 21-04-20 09:29:16, Peter Xu wrote:
> On Tue, Apr 21, 2020 at 09:10:26AM +0200, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@xxxxxxxx>
> >
> > ba841078cd05 ("mm/mempolicy: Allow lookup_node() to handle fatal signal") has
> > added special casing for a 0 return value because that was a possible
> > gup return value when interrupted by a fatal signal. This has been fixed
> > by ae46d2aa6a7f ("mm/gup: Let __get_user_pages_locked() return -EINTR
> > for fatal signal") in the meantime so ba841078cd05 can be reverted.
> >
> > This patch however doesn't go all the way to revert it because the check
> > for 0 is wrong and confusing here. Firstly it is inherently unsafe to
> > access the page when get_user_pages_locked returns 0 (aka no page
> > returned).
> > Fortunately this will not happen because get_user_pages_locked will not
> > return 0 when nr_pages > 0 unless FOLL_NOWAIT is specified which is not
> > the case here. Document this potential error code in gup code while we
> > are at it.
> >
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> > ---
> > mm/gup.c | 5 +++++
> > mm/mempolicy.c | 5 +----
> > 2 files changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index 50681f0286de..a8575b880baf 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -980,6 +980,7 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
> > * -- If nr_pages is >0, but no pages were pinned, returns -errno.
> > * -- If nr_pages is >0, and some pages were pinned, returns the number of
> > * pages pinned. Again, this may be less than nr_pages.
> > + * -- 0 return value is possible when the fault would need to be retried.
> > *
> > * The caller is responsible for releasing returned @pages, via put_page().
> > *
> > @@ -1247,6 +1248,10 @@ int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
> > }
> > EXPORT_SYMBOL_GPL(fixup_user_fault);
> >
> > +/*
> > + * Please note that this function, unlike __get_user_pages will not
> > + * return 0 for nr_pages > 0 without FOLL_NOWAIT
>
> It's a bit unclear to me on whether "will not return 0" applies to "this
> function" or "__get_user_pages"... Might be easier just to avoid mentioning
> __get_user_pages?

I really wanted to call out __get_user_pages because the semantics of
a 0 return value are different there. If you have a suggestion for how
to reformulate this more clearly, I will incorporate it.

> > + */
> > static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
> > struct mm_struct *mm,
> > unsigned long start,
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 48ba9729062e..1965e2681877 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -927,10 +927,7 @@ static int lookup_node(struct mm_struct *mm, unsigned long addr)
> >
> > int locked = 1;
> > err = get_user_pages_locked(addr & PAGE_MASK, 1, 0, &p, &locked);
> > - if (err == 0) {
> > - /* E.g. GUP interrupted by fatal signal */
> > - err = -EFAULT;
> > - } else if (err > 0) {
> > + if (err > 0) {
> > err = page_to_nid(p);
> > put_page(p);
> > }
>
> Again, this is my totally humble opinion: I'm fine with removing the comment,
> however I still don't think it's helpful at all to explicitly remove a check
> against invalid return value (err==0), especially if that's the only functional
> change in this patch.

I thought I had explained that when we discussed this last time, and the
changelog explains it as well. Checking for an impossible return value
is simply confusing and invites copy&pasting of the pattern. I
wouldn't really bother if I hadn't seen this cargo cult pattern so
many times.
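
To illustrate the caller-side contract, here is a rough userspace sketch,
not kernel code. mock_page, mock_gup_locked() and lookup_node_sketch() are
hypothetical names made up for this example; the mock only models the
documented behaviour for nr_pages == 1 without FOLL_NOWAIT (a positive
count or a negative errno, never 0).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the node id matters here. */
struct mock_page {
	int nid;
};

/*
 * Hypothetical model of the get_user_pages_locked() contract for
 * nr_pages == 1 without FOLL_NOWAIT: returns 1 and fills *page on
 * success, or a negative errno on failure. It never returns 0.
 */
static long mock_gup_locked(struct mock_page *backing, int fail_errno,
			    struct mock_page **page)
{
	assert(fail_errno >= 0);	/* callers pass a positive errno or 0 */

	if (fail_errno)
		return -fail_errno;

	*page = backing;
	return 1;
}

/*
 * Same shape as lookup_node() after the patch: only err > 0 touches
 * the page, and any negative errno is passed through unchanged.
 */
static int lookup_node_sketch(struct mock_page *backing, int fail_errno)
{
	struct mock_page *p = NULL;
	long err = mock_gup_locked(backing, fail_errno, &p);

	if (err > 0)
		err = p->nid;	/* page_to_nid(p); put_page(p); in the kernel */

	return (int)err;
}
```

With this shape a fatal signal shows up as -EINTR from the gup call and
reaches the caller as is, so there is no err == 0 branch left to copy.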
--
Michal Hocko
SUSE Labs