[BUG] Commit d065bd81 severely regresses huge page allocation success rates

From: Mel Gorman
Date: Thu Nov 11 2010 - 07:16:07 EST


When testing 2.6.37-rc1, I noticed that huge page allocation success
rates were severely impaired. Bisection showed that commit [d065bd81: mm:
retry page fault when blocking on disk transfer] was the biggest factor.
Reverting the patch confirmed this. Here are the results of a high-order
allocation stress test. The vanilla kernel is 2.6.37-rc1 and the revert
kernel has this commit removed with minor conflicts cleaned up.

STRESS-HIGHALLOC
            highalloc-vanilla   revert-d065bd81
Pass 1         7.00 ( 0.00%)    73.00 (66.00%)
Pass 2         7.00 ( 0.00%)    92.00 (85.00%)
At Rest       13.00 ( 0.00%)    93.00 (80.00%)

The "Pass 1" and "Pass 2" rows are allocation attempts made while the
machine is under heavy load. One might expect allocations to fail there,
but even when the machine is fully at rest, with all memory freed and
nothing else going on, the pages still cannot be allocated.

I had ftrace enabled and found this.

FTrace Reclaim Statistics: vmscan
                                           vanilla  revert-d065bd81
Direct reclaims                               3687              889
Direct reclaim pages scanned              39767013           195182
Direct reclaim pages reclaimed              115079           107891
Direct reclaim write file async I/O          13598             5777
Direct reclaim write anon async I/O          70886            40954
Direct reclaim write file sync I/O               0                0
Direct reclaim write anon sync I/O              37              178
Wake kswapd requests                          6508              868
Kswapd wakeups                                1291              521
Kswapd pages scanned                      77859381          3240330
Kswapd pages reclaimed                     2548099          1965881
Kswapd reclaim write file async I/O          51266            56838
Kswapd reclaim write anon async I/O         935070           392199
Kswapd reclaim write file sync I/O               0                0
Kswapd reclaim write anon sync I/O               0                0
Time stalled direct reclaim (seconds)      1160.57           636.24
Time kswapd awake (seconds)                1453.81           654.25

Total pages scanned                      117626394          3435512
Total pages reclaimed                      2663178          2073772
%age total pages scanned/reclaimed           2.26%           60.36%
%age total pages scanned/written             0.91%           14.44%
%age file pages scanned/written              0.06%            1.82%
Percentage Time Spent Direct Reclaim        25.92%           15.98%
Percentage Time kswapd Awake                65.57%           36.01%
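
The derived percentages in the table above can be rechecked from the raw
counters. What follows is a minimal Python sketch and not part of the
original report: the dictionary keys and the helper names (COUNTERS, pct,
derived) are made up for illustration, and the formulas (reclaimed/scanned,
written/scanned) are inferred from the row labels. They do agree with the
reported values to two decimal places.

```python
# Raw counters copied from the ftrace table: (vanilla, revert-d065bd81).
# Key names are illustrative only, not kernel or ftrace identifiers.
COUNTERS = {
    "direct_scanned":    (39767013, 195182),
    "direct_reclaimed":  (115079, 107891),
    "direct_file_async": (13598, 5777),
    "direct_anon_async": (70886, 40954),
    "direct_file_sync":  (0, 0),
    "direct_anon_sync":  (37, 178),
    "kswapd_scanned":    (77859381, 3240330),
    "kswapd_reclaimed":  (2548099, 1965881),
    "kswapd_file_async": (51266, 56838),
    "kswapd_anon_async": (935070, 392199),
    "kswapd_file_sync":  (0, 0),
    "kswapd_anon_sync":  (0, 0),
}


def pct(numerator, denominator):
    """Return numerator as a percentage of denominator."""
    return 100.0 * numerator / denominator


def derived(col):
    """Compute the summary rows for one column (0 = vanilla, 1 = revert)."""
    c = {name: values[col] for name, values in COUNTERS.items()}
    scanned = c["direct_scanned"] + c["kswapd_scanned"]
    reclaimed = c["direct_reclaimed"] + c["kswapd_reclaimed"]
    # Pages written back, split by file-backed and anonymous.
    file_written = sum(v for k, v in c.items() if "_file_" in k)
    anon_written = sum(v for k, v in c.items() if "_anon_" in k)
    return {
        "total_scanned": scanned,
        "total_reclaimed": reclaimed,
        "scanned_reclaimed_pct": pct(reclaimed, scanned),
        "scanned_written_pct": pct(file_written + anon_written, scanned),
        "file_scanned_written_pct": pct(file_written, scanned),
    }


for name, col in (("vanilla", 0), ("revert-d065bd81", 1)):
    d = derived(col)
    print(f"{name}: reclaimed {d['scanned_reclaimed_pct']:.2f}% of scanned, "
          f"wrote {d['scanned_written_pct']:.2f}% of scanned")
```

The two "Percentage Time" rows are left out because the total test duration
they are relative to is not given in the table.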

Reverting the commit improves overall reclaim behaviour when allocating huge
pages. Note in particular the low scanned/reclaimed percentage in the
vanilla kernel, which implies that the vanilla kernel endlessly scans pages
it cannot reclaim. I also note that with the vanilla kernel nr_inactive_*
remains high, but when the patch is reverted it drops, implying that the
patch is preventing pages from being reclaimed.
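
For anyone wanting to watch the nr_inactive_* counters while reproducing
this, a minimal Python sketch follows. It assumes the counters are sampled
from /proc/vmstat, where they appear as nr_inactive_anon and
nr_inactive_file; the report does not say which interface was used, and
the function names here are hypothetical.

```python
def inactive_counters(vmstat_text):
    """Extract the nr_inactive_* counters from /proc/vmstat-style text.

    Each line is expected to be "<name> <value>"; other lines are skipped.
    """
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("nr_inactive_"):
            counters[parts[0]] = int(parts[1])
    return counters


def read_inactive(path="/proc/vmstat"):
    """Read the given file (assumed /proc/vmstat) and return the counters."""
    with open(path) as f:
        return inactive_counters(f.read())
```

Calling read_inactive() periodically during the At Rest phase on both
kernels should show whether the inactive lists drain or stay high.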

It does not make a difference if compaction is used - the figures are
still brutal.

I have not digested what the patch is doing but am reporting it in case
people familiar with the patch spot the problem quickly.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab