Re: [patch for-5.3 0/4] revert immediate fallback to remote hugepages

From: Linus Torvalds
Date: Sat Sep 28 2019 - 17:03:42 EST


On Fri, Sep 27, 2019 at 12:48 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> - page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
> + if (!order)
> + page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
> if (page)
> goto got_pg;
>
> The whole point of handling this in the page allocator directly is to
> have a unified solutions rather than have each specific caller invent
> its own way to achieve higher locality.

The above just looks hacky.

Why would order-0 be special?

It really is hugepages that are special - small orders don't generally
need a lot of compaction. and this secondary get_page_from_freelist()
is not primarily about compaction, it's about relaxing some of the
other constraints, and the actual alloc_flags have changed from the
initial ones.

I really think that you're missing the big picture. We want node
locality, and we want big pages, but those "we want" simply shouldn't
be as black-and-white as they sometimes are today. In fact, the whole
black-and-white thing is completely crazy for anything but some
test-load.

I will just do the reverts and apply David's two patches on top. We
now have the time to actually test the behavior, and we're not just
before a release.

The thing is, David has numbers, and the patches make _sense_. There's
a description of what they do, there are comments, but the *code*
makes sense too. Much more sense than the above kind of insane hack.
David's patches literally do two things:

- [3/4] just admit that the allocation failed when you're trying to
allocate a huge-page and compaction wasn't successful

- [4/4] when that huge-page allocation failed, retry on another node
when appropriate

That's _literally_ what David's two patches do. The above is purely
the English translation of the patches.

And I claim that the English translation also ends up being
_sensible_. I go look at those two statements, and my reaction "yeah,
that makes sense".

So there were numbers, there was "this is sensible", and there were no
indications that those sensible choices would actually be problematic.
Nobody has actually argued against the patches making sense. Nobody
has even argued that the patches would be wrong. The _only_ argument
against them were literally "what if this changes something subtle",
together with patches like the above that do _not_ make sense either
on a big picture level or even locally on a small level.

The reason I didn't apply those patches for 5.3 was that they came in
very late, and there were horrendous numbers for the 5.2 behavior that
caused those two big reverts. But the patches made sense back then,
the timing for them just didn't.

I was hoping this would just get done in the mm tree, but clearly it
isn't, and I'll just do my own mm branch and merge the patches that
way.

Linus