I don't see other ways to optimize it (and I never liked the per-zone
lru that much, since it has some downsides too, with a worst case on 2G
systems). Perhaps a further optimization could be a transient per-cpu
lru refiled only by the page reclaim (so absolutely lazy while lots of
ram is free), but maybe that's already what you're doing when you say
"Adding/removing sixteen pages for one taking of the lock". Though the
fact that you say "sixteen pages" makes it sound like it's not as lazy as
it could be.
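
To make the difference concrete, here is a rough userspace sketch (not
kernel code) of the two policies as I understand them. All the names in
it (global_lru, cpu_lru, add_page_batched, add_page_lazy,
reclaim_refile, MAX_PENDING) are made up for illustration, the lock is
only simulated by a counter, and I'm assuming the batched variant drains
as soon as sixteen pages are pending, which may not be what your patch
actually does.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS     4
#define BATCH       16    /* the "sixteen pages" from the quoted text */
#define MAX_PENDING 1024  /* hypothetical cap, only for this sketch */

/* Shared lru; lock_takes counts simulated lock acquisitions. */
struct global_lru {
	size_t nr_pages;
	size_t lock_takes;
};

/* Transient per-cpu list; only the number of pending pages is modelled. */
struct cpu_lru {
	size_t nr_pending;
};

/* Move every pending page of one cpu to the global lru under one lock take. */
static void drain(struct global_lru *g, struct cpu_lru *c)
{
	if (c->nr_pending == 0)
		return;
	g->lock_takes++;
	g->nr_pages += c->nr_pending;
	c->nr_pending = 0;
}

/* Batched policy (assumed): drain as soon as BATCH pages are pending. */
static void add_page_batched(struct global_lru *g, struct cpu_lru *c)
{
	if (++c->nr_pending >= BATCH)
		drain(g, c);
}

/* Lazy policy: only queue locally, never touch the global lru here. */
static void add_page_lazy(struct cpu_lru *c)
{
	assert(c->nr_pending < MAX_PENDING);
	c->nr_pending++;
}

/* Page reclaim is the only place where the per-cpu lists get refiled. */
static void reclaim_refile(struct global_lru *g, struct cpu_lru cpus[], size_t n)
{
	for (size_t i = 0; i < n; i++)
		drain(g, &cpus[i]);
}
```

In this model the batched variant takes the lock once every sixteen
pages no matter how much ram is free, while the lazy variant doesn't
take it at all until reclaim runs and then takes it once per cpu that
has pages pending.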