[patch for-5.3 0/4] revert immediate fallback to remote hugepages

From: David Rientjes
Date: Wed Sep 04 2019 - 15:54:19 EST


Two commits:

commit a8282608c88e08b1782141026eab61204c1e533f
Author: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Date: Tue Aug 13 15:37:53 2019 -0700

Revert "mm, thp: restore node-local hugepage allocations"

commit 92717d429b38e4f9f934eed7e605cc42858f1839
Author: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Date: Tue Aug 13 15:37:50 2019 -0700

Revert "Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask""

made their way into 5.3-rc5.

We (mostly Linus, Andrea, and myself) have been discussing offlist how to
implement a sane default allocation strategy for hugepages on NUMA
platforms.

With these reverts in place, the page allocator will happily allocate a
remote hugepage immediately rather than try to make a local hugepage
available. This incurs a substantial performance degradation when
memory compaction would have otherwise made a local hugepage available.

This series reverts those reverts and proposes a saner default
allocation strategy specifically for hugepages. Andrea acknowledges
this is likely to fix the swap storms he originally reported, which led
to the patches that removed __GFP_THISNODE from hugepage allocations.

The immediate goal is to return 5.3 to the behavior the kernel has
implemented over the past several years, so that remote hugepages are
not immediately allocated when local hugepages could have been made
available, because the increased access latency is untenable.

The next goal is to introduce a sane default allocation strategy for
hugepage allocations in general, regardless of the configuration of the
system, so that we prevent thrashing of local memory when compaction is
unlikely to succeed and can prefer remote hugepages over remote native
pages when the local node is low on memory.
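
To make the intended preference order concrete, here is a minimal
userspace sketch of the decision described above. It is not kernel
code: struct node_state, enum alloc_result, and choose_page() are
hypothetical names invented for illustration. Placing local native
pages ahead of remote hugepages is an assumption of this sketch, based
on the access-latency argument, not something the series spells out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one NUMA node; not a kernel structure. */
struct node_state {
	bool hugepage_free;       /* a free hugepage exists right now */
	bool compaction_may_help; /* compaction is likely to yield a hugepage */
	bool native_page_free;    /* a free native (base) page exists */
};

enum alloc_result {
	LOCAL_HUGEPAGE,
	LOCAL_HUGEPAGE_AFTER_COMPACTION,
	LOCAL_NATIVE_PAGE,
	REMOTE_HUGEPAGE,
	REMOTE_NATIVE_PAGE,
	ALLOC_FAILED,
};

/*
 * Pick a page source for a hugepage-eligible fault.
 * Assumed order: local hugepage, local hugepage via compaction,
 * local native page, remote hugepage, remote native page.
 */
enum alloc_result choose_page(const struct node_state *local,
			      const struct node_state *remote)
{
	assert(local && remote);

	if (local->hugepage_free)
		return LOCAL_HUGEPAGE;

	/* Only work on local memory when compaction is likely to succeed. */
	if (local->compaction_may_help)
		return LOCAL_HUGEPAGE_AFTER_COMPACTION;

	/* Assumption: a local native page beats a remote hugepage. */
	if (local->native_page_free)
		return LOCAL_NATIVE_PAGE;

	/* Local node is low on memory: prefer a remote hugepage ... */
	if (remote->hugepage_free)
		return REMOTE_HUGEPAGE;

	/* ... over a remote native page. */
	if (remote->native_page_free)
		return REMOTE_NATIVE_PAGE;

	return ALLOC_FAILED;
}
```

In this model a remote hugepage is chosen only once the local node can
supply neither a hugepage nor a native page, and it is never chosen
while compaction is still expected to make a local hugepage available.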

Merging these reverts late in the rc cycle changes behavior that has
existed for years, and the change is known (and acknowledged) to create
performance degradations when local hugepages could have been made
available. It serves no purpose other than to make the development of a
sane default policy much more difficult under time pressure and to
accelerate decisions that will affect how userspace is written (and how
it has regressed), decisions that otherwise require carefully crafted
and detailed implementations.

Thus, this patch series returns 5.3 to the long-standing allocation
strategy that Linux has had for years and proposes two follow-up changes
for discussion that Andrea acknowledges will avoid the swap storms that
triggered this discussion in the first place.