Re: [PATCH v3 00/21] mm: lru_lock splitting

From: Konstantin Khlebnikov
Date: Sat Feb 25 2012 - 00:31:41 EST


KAMEZAWA Hiroyuki wrote:
> On Thu, 23 Feb 2012 17:51:36 +0400
> Konstantin Khlebnikov <khlebnikov@xxxxxxxxxx> wrote:

>> v3 changes:
>> * inactive-ratio reworked again, now it is always calculated from scratch
>> * hierarchical pte reference bits filter in memory-cgroup reclaimer
>> * fixed two bugs in locking, found by Hugh Dickins
>> * locking functions slightly simplified
>> * new patch for isolated pages accounting
>> * new patch with lru interleaving
>>
>> This patchset is based on next-20120210
>>
>> git: https://github.com/koct9i/linux/commits/lruvec-v3


> I wonder.... I just wonder... if we can split a lruvec in a zone into small
> pieces of lruvec and have a split LRU lock per piece, do we still need a per-memcg lru_lock?

What per-memcg lru_lock? I don't have one.
The last patch splits the lruvecs inside each memcg by the same factor.
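To illustrate the layout under discussion (a minimal sketch only, not the actual patch code; the struct names, the split factor, and the pfn hashing are assumptions):

#include <linux/mm.h>
#include <linux/mmzone.h>
#include <linux/spinlock.h>

#define LRUVEC_SPLIT	4			/* hypothetical split factor */

/* one lruvec piece: its own lock plus its own set of LRU lists */
struct lruvec_piece {
	spinlock_t		lru_lock;
	struct list_head	pages_lru[NR_LRU_LISTS];
	unsigned long		pages_count[NR_LRU_LISTS];
};

/* a zone (or a memcg's per-zone part) carries several such pieces */
struct lru_split {
	struct lruvec_piece	piece[LRUVEC_SPLIT];
};

/* pick the piece for a page, e.g. by hashing its pfn */
static inline struct lruvec_piece *
page_lruvec_piece(struct lru_split *split, struct page *page)
{
	return &split->piece[page_to_pfn(page) % LRUVEC_SPLIT];
}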


> It seems a per-memcg lru_lock can be a much bigger lock than a small per-lruvec lock
> (depends on configuration) and much more complicated... it has to take care
> of many things. If the unit of splitting can be specified by a boot option,
> it seems admins can split a big memcg's lru_lock into smaller pieces.

The lruvec count per memcg can be arbitrary, and it can be changed while the cgroup is empty.
That is not in this patchset, but it would be really easy to add.
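Roughly, such a boot option could look like this (a sketch only; the "lruvecs=" parameter name and the variable are hypothetical, not part of the patchset):

#include <linux/cache.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/log2.h>

/* hypothetical split factor, settable at boot, e.g. "lruvecs=8" */
static unsigned int lruvec_split __read_mostly = 1;

static int __init setup_lruvec_split(char *str)
{
	unsigned int n;

	/* accept only powers of two to keep the page-to-lruvec mapping cheap */
	if (kstrtouint(str, 0, &n) || !n || !is_power_of_2(n))
		return -EINVAL;
	lruvec_split = n;
	return 0;
}
early_param("lruvecs", setup_lruvec_split);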


> BTW, how should we think about the default size of splitting? I wonder whether splitting the lru into
> the number of cpus per node could be a choice. Each cpu may have a chance to
> set a preferred pfn range at page allocation with additional patches.

If we rework page-to-memcg linking and add a direct lruvec id into page->flags,
we will be able to change the lruvec before inserting the page into the lru.
Thus each cpu would always insert pages into its own lruvec in the zone.
I have not thought about the races yet, but this would be the perfect solution.
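A rough sketch of what a direct lruvec id in page->flags could look like (the bit width and position are placeholders; a real patch would have to fit them next to the existing zone/node/section fields, and the non-atomic update below is only safe before the page is published):

#include <linux/mm.h>

#define LRUVEC_ID_BITS		2			/* up to 4 lruvecs per zone */
#define LRUVEC_ID_PGSHIFT	20			/* placeholder bit position */
#define LRUVEC_ID_MASK		((1UL << LRUVEC_ID_BITS) - 1)

static inline unsigned int page_lruvec_id(struct page *page)
{
	return (page->flags >> LRUVEC_ID_PGSHIFT) & LRUVEC_ID_MASK;
}

/*
 * Non-atomic update: only safe while nobody else can see the page,
 * e.g. right after allocation and before it is added to an LRU list.
 */
static inline void set_page_lruvec_id(struct page *page, unsigned int id)
{
	page->flags &= ~(LRUVEC_ID_MASK << LRUVEC_ID_PGSHIFT);
	page->flags |= ((unsigned long)id & LRUVEC_ID_MASK) << LRUVEC_ID_PGSHIFT;
}

Each cpu could then, for example, set an id derived from smp_processor_id() at allocation time, so that the later lru insertion lands in the cpu's own lruvec.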


> Thanks,
> -Kame


---

Konstantin Khlebnikov (21):
memcg: unify inactive_ratio calculation
memcg: make mm_match_cgroup() hierarchical
memcg: fix page_referencies cgroup filter on global reclaim
memcg: use vm_swappiness from target memory cgroup
mm: rename lruvec->lists into lruvec->pages_lru
mm: lruvec linking functions
mm: add lruvec->pages_count
mm: unify inactive_list_is_low()
mm: add lruvec->reclaim_stat
mm: kill struct mem_cgroup_zone
mm: move page-to-lruvec translation upper
mm: push lruvec into update_page_reclaim_stat()
mm: push lruvecs from pagevec_lru_move_fn() to iterator
mm: introduce lruvec locking primitives
mm: handle lruvec relocks on lumpy reclaim
mm: handle lruvec relocks in compaction
mm: handle lruvec relock in memory controller
mm: add to lruvec isolated pages counters
memcg: check lru vectors emptiness in pre-destroy
mm: split zone->lru_lock
mm: zone lru vectors interleaving


include/linux/huge_mm.h | 3
include/linux/memcontrol.h | 75 ------
include/linux/mm.h | 66 +++++
include/linux/mm_inline.h | 19 +-
include/linux/mmzone.h | 39 ++-
include/linux/swap.h | 6
mm/Kconfig | 16 +
mm/compaction.c | 31 +--
mm/huge_memory.c | 14 +
mm/internal.h | 204 +++++++++++++++++
mm/ksm.c | 2
mm/memcontrol.c | 343 +++++++++++-----------------
mm/migrate.c | 2
mm/page_alloc.c | 70 +-----
mm/rmap.c | 2
mm/swap.c | 217 ++++++++++--------
mm/vmscan.c | 534 ++++++++++++++++++++++++--------------------
mm/vmstat.c | 6
18 files changed, 932 insertions(+), 717 deletions(-)



