Re: [PATCH] mm: move_pages: fix the return value if there are not-migrated pages

From: Yang Shi
Date: Wed Jan 22 2020 - 12:27:05 EST




On 1/22/20 12:06 AM, Michal Hocko wrote:
On Tue 21-01-20 11:01:30, Yang Shi wrote:

On 1/21/20 12:40 AM, Michal Hocko wrote:
On Tue 21-01-20 09:44:16, Wei Yang wrote:
On Mon, Jan 20, 2020 at 02:17:44PM +0100, Michal Hocko wrote:
On Mon 20-01-20 14:06:26, Michal Hocko wrote:
On Sat 18-01-20 13:26:43, Yang Shi wrote:
do_move_pages_to_node() might return a value > 0 (the number of pages
that were not migrated), and that value is then returned to userspace
directly. But the move_pages() syscall should only return 0 or an errno.
So we need to reset the return value to 0 in that case, as pre-v4.17 did.
The patch is wrong. migrate_pages returns the number of pages it
_hasn't_ migrated or -errno. Yeah that semantic sucks but...
So err != 0 is always an error. Except err > 0 doesn't really provide
any useful information to the userspace. I cannot really remember what
was the actual behavior before my rework because there were some gotchas
hidden there.
OK, so I've double checked. do_move_page_to_node_array would carry the
error code over to do_pages_move, which would copy out the status stored
in the pm array. It contains page_to_nid(page), so the resulting code
indeed behaved properly before my change and this is a regression. I
Thanks, I see the change.

have a very vague recollection that this has been brought up already.
<...looks in notes...>
Found it! The report is
http://lkml.kernel.org/r/0329efa0984b9b0252ef166abb4498c0795fab36.1535113317.git.jstancek@xxxxxxxxxx
and my proposed workaround was http://lkml.kernel.org/r/20180829145537.GZ10223@xxxxxxxxxxxxxx
Well, the above two links return 404.
You are right. They are not archived for some reason. Anyway, the patch
I was proposing back then is below:

commit cfb88c266b645197135cde2905c2bfc82f6d82a9
Author: Michal Hocko <mhocko@xxxxxxxx>
Date: Wed Nov 14 12:19:09 2018 +0100

mm: fix do_pages_move error reporting

a49bd4d71637 ("mm, numa: rework do_pages_move") changed how we report
errors to the layers above. As the changelog mentioned, the semantic
was quite unclear previously because a return value of 0 could mean
both success and failure.

The above-mentioned commit didn't go all the way to fix this
completely because it doesn't report pages that we haven't even
attempted to migrate, and therefore we cannot simply say that the
semantic is:
- err < 0 - errno
- err >= 0 number of non-migrated pages.

Fixes: a49bd4d71637 ("mm, numa: rework do_pages_move")
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
Thanks, Michal. But it looks like this patch could still return a value
> 0 (the total number of non-migrated pages, including pages not even
attempted). The problem we are trying to fix is to make do_pages_move()
return only values <= 0, since the man page of move_pages() doesn't
allow a return value > 0.
Yes this patch just lives with the changed semantic and tries to make it
sensible. So if some page cannot be migrated then we just stop and
return the number of non-migrated pages at the tail of the given array.
This would make error handling slightly easier because you know that
count - ret pages of the array can be skipped if ret >= 0.
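
To make the tail semantic concrete, here is a minimal userspace sketch
in C. It assumes the semantic proposed in the patch above (a positive
return value counts the non-migrated entries at the tail of the array).
The helper name first_unmigrated() is hypothetical and is not part of
any kernel or libnuma API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel or libnuma API.
 *
 * Assumes the semantic discussed above: for a move_pages()-style call on
 * `count` pages, ret < 0 is an errno-style failure, ret == 0 means
 * everything was migrated, and ret > 0 is the number of non-migrated
 * entries at the tail of the array.
 *
 * Returns the index of the first entry that still needs attention,
 * `count` if there is none, or -1 if the call failed with an error.
 */
long first_unmigrated(size_t count, long ret)
{
	if (ret < 0)
		return -1;	/* hard error: nothing can be assumed */

	/* Under the assumed semantic ret never exceeds count. */
	assert((size_t)ret <= count);

	/* Entries [0, count - ret) can be skipped by the caller. */
	return (long)(count - (size_t)ret);
}
```

A caller would then only inspect or retry entries from the returned
index onward; for example, count = 8 and ret = 3 give index 5.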

OK, I see. Returning a value > 0 sounds more straightforward for userspace error handling.

BTW, we should update the man page to reflect the semantic change and document a return value > 0 as an error case.


And, looking at the old code (v4.16), I spotted another problem.
migrate_pages() would store the migration failure error code in
page_to_node->status. So when do_move_page_to_node_array() returned a
value > 0, the return value was reset to 0 and the migration error codes
for the non-migrated pages were stored in status and returned to
userspace. But the rework removed this.
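
For illustration: under the old (v4.16) behaviour described above, a
return value of 0 did not by itself mean that every page moved, so a
caller had to scan the status array. The sketch below is a minimal
userspace example of such a scan. It assumes, as move_pages(2)
documents, that each status entry holds either a node number or a
negative errno value. The helper name count_failed() is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel or libnuma API.
 *
 * Counts the entries of a move_pages()-style status array that carry a
 * negative errno value instead of a node number.
 */
size_t count_failed(const int *status, size_t count)
{
	size_t failed = 0;

	for (size_t i = 0; i < count; i++) {
		if (status[i] < 0)	/* e.g. -EBUSY or -EACCES */
			failed++;
	}
	return failed;
}
```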

I didn't dig into the intention of the rework. Is this expected?
I have tried to preserve the original semantic as much as possible. As
explained in the changelog, there were quite some discrepancies even
before. This new one was not really intentional. We have effectively two
options here. Either somebody really depends on the former semantic and
we have to fix this, or we can relax the semantic as the above patch
attempts.

I would be more inclined towards the second option, as nobody has
complained about the new semantic except for a few LTP tests, which do
not represent a real workload. If you have a real use case, please speak
up.

No, I don't have any real use case. And I tend to agree that most users may not care about the reason for a migration failure at all. Returning the number of non-migrated pages seems more straightforward.

I agree we could stick with the new semantic and fix the return value as your patch did. I'm going to rebase your patch on top of Wei Yang's cleanup if you don't mind.