Re: [PATCH] mm, memory_hotplug: Don't bail out in do_migrate_range prematurely

From: osalvador
Date: Tue Dec 11 2018 - 07:22:33 EST


On 2018-12-11 11:18, Michal Hocko wrote:
Currently, if we fail to isolate a single page, we put all already
isolated pages back to their LRU and we bail out from the function.
This is quite suboptimal, as this will force us to start over again
because scan_movable_pages will give us the same range.
If there is no chance that we can isolate that page, we will loop here
forever.

This is true but reorganizing the code will not help the underlying
issue. Because the permanently failing page will be still there for
scan_movable_pages to encounter.

Well, it would only help in case the page is neither LRU nor
non-movable page, then we would fail to isolate it in do_migrate_range
and we will start over.
Letting do_migrate_range do some work, would mean that at some point
the permanently failing page will not be within a range but the first one
of a range, and so scan_movable_pages will skip it.



Issue debugged in 4d0c7db96 ("hwpoison, memory_hotplug: allow hwpoisoned
pages to be offlined") has proved that.

I assume that 4d0c7db96 is a sha1 from the linux-next. Please note that
this is not going to be the case when merged upstream. So I would use a
link.

I will replace the sha1 with the link in the next version.

Although this patch has proved to be useful when dealing with
4d0c7db96 because it allows us to move forward as long as the
page is not in LRU, we still need 4d0c7db96
("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined")
to handle the LRU case and the unmapping of the page if needed.
So, this is just a follow-up cleanup.

I suspect the above paragraph is adding more confusion than necessary. I
would just drop it.

Fair enough, I will drop it.

The main question here is. Do we want to migrate as much as possible or
do we want to be conservative and bail out early. The later could be an
advantage if the next attempt could fail the whole operation because the
impact of the failed operation would be somehow reduced. The former
should be better for throughput because easily done stuff is done first.

I would go with the throuput because our failure mode is to bail out
much earlier - even before we try to migrate. Even though the detection
is not perfect it works reasonably well for most usecases.

I agree here.
I think it is better to do as much work as possible at once.


you really want to keep this branch. You just do not want to bail out.
We want to know about pages which fail to isolate and you definitely do
not want to keep the reference elevated behind. not_managed stuff can go
away.

Yeah, I just realized when I sent it.
I will fix it up in v2.

Thanks