Ok. To me, the rest of the thread is going around the same points and
no one is giving ground. The points have been made, so let's summarise.
Apologies if anything is missing.
Who cares
=========
Physical hotplug remove: Vendors of hardware that supports this -
Fujitsu, HP (I think), IBM etc
Virtualization hotplug remove: Sellers of virtualization software, and
some hardware, like any IBM machine that lists LPAR in its list of
features. Software solutions like Xen are probably also affected
if they want to be able to grow and shrink virtual machines on
demand
High order allocations: Ultimately, hugepage users. Today, that is a
feature only big server users like Oracle care about. In the
future I reckon applications will be able to use them for things
like backing the heap with huge pages. Other users like GigE,
loopback devices with large MTUs, and some filesystems like CIFS are
all interested, although they are also being told to use smaller
pages.
Pros/Cons of Solutions
======================
Anti-defrag Pros
o Aim9 shows no significant regressions (0.37% on page_test). On some
tests, it shows performance gains (> 5% on fork_test)
o Stress tests show that it manages to keep fragmentation down to a far
lower level, even without teaching kswapd how to do linear reclaim
o Stress tests with an experimental linear reclaim patch show that it
can successfully find large contiguous chunks of memory
o It is known to help hotplug on PPC64
o No tunables. The approach tries to manage itself as much as possible
o It exists, is heavily tested, and is synced against the latest -mm1
o Can be compiled away by redefining the RCLM_* macros and the
__GFP_*RCLM flags
Anti-defrag Cons
o More complexity within the page allocator
o Adds a new layer onto the allocator that effectively creates subzones
o Adds a new concept that maintainers have to work with
o Depending on the workload, it fragments anyway
New Zone Pros
o Zones are a well known and understood concept
o People who do not care about hotplug can easily get rid of it
o Provides reliable areas of contiguous groups that can be freed for
HugeTLB pages going to userspace
o Uses existing zone infrastructure for balancing
New Zone Cons
o Zones historically have introduced balancing problems
o Has been tried for hotplug and was dropped because it was awkward to
work with
o It only helps hotplug and potentially HugeTLB pages for userspace
o Tunable required. If you get it wrong, the system suffers a lot
o Needs to be planned for and developed
Scenarios
=========
Let's outline some situations or workloads that can occur
1. A heavy job, like a kernel build, is running that consumes 75% of
physical memory.
Anti-defrag: It will not fragment as it will never have to fall back.
High order allocations will be possible in the remaining 25%.
Zone-based: After being tuned to a kernel build load, it will not
fragment. Get the tuning wrong, and performance suffers or the workload
fails. High order allocations will be possible in the remaining 25%.
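To make the anti-defrag half of scenario 1 a bit more concrete, here is a
toy userspace model of the idea. It is only a sketch and not the patch
itself: the block and page counts, the two allocation types, the 1:3
kernel/user mix and every function name are assumptions made up for
illustration. Each block is owned by one allocation type, a type only
claims a new block when its own blocks are full, and a "fallback" is
counted when a page has to go into a block owned by the other type.

```c
#include <assert.h>

/*
 * Toy model only, not kernel code. All names here are hypothetical.
 * Memory is NBLOCKS blocks of BLOCK_PAGES pages and each block is
 * owned by at most one allocation type at a time.
 */
enum { NBLOCKS = 16, BLOCK_PAGES = 8 };
enum { TYPE_NONE = -1, TYPE_KERNEL = 0, TYPE_USER = 1 };

struct block {
	int owner;	/* TYPE_NONE while the block is completely free */
	int used;	/* pages allocated from this block */
};

static struct block mem[NBLOCKS];
static int fallbacks;	/* pages placed in a block owned by another type */

static void mem_init(void)
{
	int i;

	for (i = 0; i < NBLOCKS; i++) {
		mem[i].owner = TYPE_NONE;
		mem[i].used = 0;
	}
	fallbacks = 0;
}

/* Allocate one page of the given type. Returns 0, or -1 if memory is full. */
static int toy_alloc_page(int type)
{
	int i;

	/* 1. Prefer a block already owned by this type. */
	for (i = 0; i < NBLOCKS; i++) {
		if (mem[i].owner == type && mem[i].used < BLOCK_PAGES) {
			mem[i].used++;
			return 0;
		}
	}
	/* 2. Otherwise claim a completely free block for this type. */
	for (i = 0; i < NBLOCKS; i++) {
		if (mem[i].owner == TYPE_NONE) {
			mem[i].owner = type;
			mem[i].used = 1;
			return 0;
		}
	}
	/* 3. Fall back into another type's block; this is what fragments. */
	for (i = 0; i < NBLOCKS; i++) {
		if (mem[i].used < BLOCK_PAGES) {
			mem[i].used++;
			fallbacks++;
			return 0;
		}
	}
	return -1;
}

/* Count blocks that are still completely free for high order allocations. */
static int free_blocks(void)
{
	int i, n = 0;

	for (i = 0; i < NBLOCKS; i++)
		if (mem[i].owner == TYPE_NONE)
			n++;
	return n;
}

/*
 * Fill `percent` of memory with one kernel page per three user pages.
 * Returns the number of fallbacks; *free_out gets the free block count.
 */
static int run_scenario(int percent, int *free_out)
{
	int total = NBLOCKS * BLOCK_PAGES;
	int target = total * percent / 100;
	int i;

	assert(percent >= 0 && percent <= 100);
	mem_init();
	for (i = 0; i < target; i++)
		if (toy_alloc_page(i % 4 == 0 ? TYPE_KERNEL : TYPE_USER) != 0)
			break;
	*free_out = free_blocks();
	return fallbacks;
}
```

In this model, run_scenario(75, &n) reports no fallbacks and leaves 4 of
the 16 blocks completely untouched, which is the "remaining 25%" above.
The model has no frees and no reclaim, so it says nothing about the cases
where anti-defrag does break down.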
I've tried to be as objective as possible with the summary.
From the points above though, I think that anti-defrag gets us a lot of
the way, with the complexity isolated in one place. Its downside is that
it can still break down and future work is needed to stop it degrading
(kswapd cleaning UserRclm areas and page migration when we get really
stuck). Zone-based is more reliable but only addresses a limited
situation, principally hotplug, and it does not even go 100% of the way
for hotplug.
If we make the zones growable+shrinkable, we run into all the same
problems that anti-defrag has today.