Memory retained in Per-CPU Pages (PCP) caches can prevent hugepage
allocations from succeeding despite sufficient free system memory. This
occurs because:
1. Hugepage allocations don't actively trigger PCP draining.
2. The direct reclaim path fails to trigger drain_all_pages() when:
   a) All zone pages are free or hugetlb pages (!did_some_progress)
   b) Compaction is skipped due to costly-order watermarks (COMPACT_SKIPPED)
Reproduction:
- Allocate a page and free it via put_page() so that it is released to
  the PCP list
- Observe that the hugepage reservation fails
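One way to observe pages sitting on the PCP lists is to sum the per-CPU
"count:" fields that /proc/zoneinfo reports under each zone's "pagesets"
block. The following is a minimal sketch and not part of the patch: the
helper name pcp_pages_per_zone is hypothetical, and the parsing assumes
the usual zoneinfo layout ("Node N, zone NAME" headers followed by
per-CPU "count:" lines).

```shell
#!/bin/sh
# Hypothetical helper: sum the per-CPU "count:" fields found under each
# zone's "pagesets" block. Reads zoneinfo-formatted text on stdin and
# prints one line per zone: "<node> <zone> <pages held in PCP lists>".
pcp_pages_per_zone() {
    awk '
        /^Node/ {
            # Zone header, assumed to look like: "Node 2, zone  Movable"
            node = $2
            sub(/,/, "", node)
            key = node " " $4
            if (!(key in total)) {
                order[++n] = key
                total[key] = 0
            }
            in_pcp = 0
            next
        }
        # Only count lines that follow the "pagesets" marker of a zone
        $1 == "pagesets" { in_pcp = 1; next }
        in_pcp && $1 == "count:" { total[key] += $2 }
        END {
            for (i = 1; i <= n; i++)
                print order[i], total[order[i]]
        }
    '
}

# Typical use on a live system:
#   pcp_pages_per_zone < /proc/zoneinfo
```

A non-zero total for the target zone right before the nr_hugepages write
would be consistent with pages being held back by the PCP lists.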
Solution:
Actively drain the PCP lists during direct reclaim for memory
allocations. This increases the page allocation success rate by making
stranded pages available to allocations of any order.
Verification:
This issue can be reproduced easily in ZONE_MOVABLE with the following
steps:
Without this patch:
# numactl -m 2 dd if=/dev/urandom of=/dev/shm/testfile bs=4k count=64
# rm -f /dev/shm/testfile
# sync
# echo 3 > /proc/sys/vm/drop_caches
# echo 2048 > /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
# cat /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
2029
With this patch:
# numactl -m 2 dd if=/dev/urandom of=/dev/shm/testfile bs=4k count=64
# rm -f /dev/shm/testfile
# sync
# echo 3 > /proc/sys/vm/drop_caches
# echo 2048 > /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
# cat /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages
2047
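The steps above can be wrapped in a small script so that the same
sequence is run on kernels with and without the patch. This is only a
sketch: the function names (nr_hugepages_path, hugepage_shortfall,
run_repro) are hypothetical, the node number and page count are passed as
arguments, and run_repro must be run as root on a machine that has the
given NUMA node.

```shell
#!/bin/sh
# Hypothetical helper: sysfs path of the per-node 2048kB hugepage counter.
nr_hugepages_path() {
    printf '/sys/devices/system/node/node%s/hugepages/hugepages-2048kB/nr_hugepages\n' "$1"
}

# Hypothetical helper: number of requested hugepages that were not obtained.
hugepage_shortfall() {
    echo $(( $1 - $2 ))
}

# Run the sequence from the verification steps on node $1, requesting $2
# hugepages, then report how many were actually reserved. Needs root.
run_repro() {
    node=$1
    want=$2
    path=$(nr_hugepages_path "$node")

    numactl -m "$node" dd if=/dev/urandom of=/dev/shm/testfile bs=4k count=64
    rm -f /dev/shm/testfile
    sync
    echo 3 > /proc/sys/vm/drop_caches
    echo "$want" > "$path"
    got=$(cat "$path")

    echo "requested=$want got=$got shortfall=$(hugepage_shortfall "$want" "$got")"
}

# Example invocation matching the steps above:
#   run_repro 2 2048
```

With the figures reported above, the shortfall is 19 pages without the
patch (2048 - 2029) and 1 page with it (2048 - 2047).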