Re: [patch] slub: default min_partial to at least highest cpus per node

From: Pekka Enberg
Date: Tue Apr 07 2009 - 15:58:57 EST


David Rientjes wrote:
On Tue, 7 Apr 2009, Pekka Enberg wrote:

Hmm, partial lists are per-node, so wouldn't it be better to do the
adjustment for every struct kmem_cache_node separately? The
'min_partial_per_node' global seems just too ugly and confusing to live
with.
Btw, that requires moving ->min_partial to struct kmem_cache_node from
struct kmem_cache. But I think that makes a whole lot of sense if
some nodes may have more CPUs than others.

And while the improvement is kinda obvious, I would be interested to
know what kind of workload benefits from this patch (and see numbers
if there are any).


It doesn't really depend on the workload; it depends on the type of NUMA
machine it's running on (and whether its cpus are distributed
asymmetrically among the nodes).

Since min_partial_per_node is capped at MAX_PARTIAL, this is only really
relevant for remote node defragmentation when it is allowed (and not just
2% of the time, as with the default). We want to avoid stealing partial
slabs from a remote node if it has fewer partial slabs than cpus.

Otherwise, it's possible for each cpu on the victim node to try to
allocate a single object and require nr_cpus_node(node) new slab
allocations. In that case it's entirely possible for the majority of cpus
to end up with cpu slabs from remote nodes. This change reduces the
likelihood of that happening because we'll always have cpu slab
replacements on our local partial list before allowing remote
defragmentation.

I'd be just as happy with the following, although for optimal performance
it would require raising MIN_PARTIAL above its default of 5 when a node
supports more cpus (the old patch did that automatically, up to
MAX_PARTIAL).

Hmm, but why not move ->min_partial to struct kmem_cache_node as I
suggested and make sure it's adjusted properly based on nr_cpus_node()?

diff --git a/mm/slub.c b/mm/slub.c
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1326,11 +1326,13 @@ static struct page *get_any_partial(struct kmem_cache *s, gfp_t flags)
 	zonelist = node_zonelist(slab_node(current->mempolicy), flags);
 	for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
 		struct kmem_cache_node *n;
+		int node;
 
-		n = get_node(s, zone_to_nid(zone));
+		node = zone_to_nid(zone);
+		n = get_node(s, node);
 		if (n && cpuset_zone_allowed_hardwall(zone, flags) &&
-				n->nr_partial > s->min_partial) {
+				n->nr_partial > nr_cpus_node(node)) {
 			page = get_partial_node(n);
 			if (page)
 				return page;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/