Re: [PATCH v3] mm: make expand_downwards symmetrical to expand_upwards

From: James Bottomley
Date: Thu Apr 21 2011 - 12:06:49 EST


On Wed, 2011-04-20 at 14:42 -0700, David Rientjes wrote:
> On Wed, 20 Apr 2011, Christoph Lameter wrote:
>
> > There is barely any testing going on at all of this since we have had this
> > issue for more than 5 years and have not noticed it. The absence of bug
> > reports therefore proves nothing. Code inspection of the VM shows
> > that this is an issue that arises in multiple subsystems and that we have
> > VM_BUG_ONs in the page allocator that should trigger for these situations.
> >
> > Usage of DISCONTIGMEM and !NUMA is not safe and should be flagged as such.
> >
>
> We don't actually have any bug reports in front of us that show anything
> else in the VM other than slub has issues with this configuration, so
> marking them as broken is probably premature. The parisc config that
> triggered this debugging enables CONFIG_SLAB by default, so it probably
> has gone unnoticed just because nobody other than James has actually tried
> it on hppa64.
>
> Let's see if KOSAKI-san's fixes to Kconfig (even though I'd prefer the
> simpler and implicit "config NUMA def_bool ARCH_DISCONTIGMEM_ENABLE" over
> his config NUMA) and my fix to parisc to set the bit in N_NORMAL_MEMORY
> so that CONFIG_SLUB initializes kmem_cache_node correctly works and then
> address issues in the core VM as they arise. Presumably someone has been
> running DISCONTIGMEM on hppa64 in the past five years without issues with
> defconfig, so the issue here may just be slub.

Actually, we can fix slub. As far as all my memory hammer tests go, the
one-liner below is the actual fix (it just forces slub's get_node() to
always return the zero node on !NUMA). That, as far as code
inspection goes, seems to make SLUB as good as SLAB ... as long as
no-one uses hugepages or VM DEBUG, which, I think we've demonstrated, is
the case for all the current DISCONTIGMEM users.
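
For anyone following along, here is a minimal userspace sketch of what
the one-liner changes. It is a model, not kernel code: MODEL_MAX_NODES,
MODEL_NUMA, model_init() and get_node_unpatched() are hypothetical
names, and it assumes that on a !NUMA build only node 0 ever gets a
kmem_cache_node while DISCONTIGMEM can still report a non-zero node id.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: DISCONTIGMEM exposes several memory ranges as node ids. */
#define MODEL_MAX_NODES 8

/* Leave MODEL_NUMA undefined to model a !NUMA build. */
/* #define MODEL_NUMA */

/* Stand-ins for the kernel structures; only the node[] array matters. */
struct kmem_cache_node {
	unsigned long nr_partial;
};

struct kmem_cache {
	struct kmem_cache_node *node[MODEL_MAX_NODES];
};

/* Model of the assumed !NUMA setup: only node 0 is populated. */
static inline void model_init(struct kmem_cache *s,
			      struct kmem_cache_node *node0)
{
	int i;

	assert(s != NULL && node0 != NULL);
	for (i = 0; i < MODEL_MAX_NODES; i++)
		s->node[i] = NULL;
	s->node[0] = node0;
}

/* Lookup before the patch: trusts whatever node id it is given. */
static inline struct kmem_cache_node *
get_node_unpatched(struct kmem_cache *s, int node)
{
	return s->node[node];
}

/* Lookup after the patch: ignores the node id when NUMA is off. */
static inline struct kmem_cache_node *
get_node(struct kmem_cache *s, int node)
{
#ifdef MODEL_NUMA
	return s->node[node];
#else
	(void)node;
	return s->node[0];
#endif
}
```

In this model, asking the unpatched lookup for node 1 yields a NULL
pointer that a caller would then dereference, while the patched lookup
hands back the node 0 structure regardless of the id it is passed.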

I think either the above or just marking slub broken in DISCONTIGMEM &
!NUMA is sufficient for stable. The fix is getting urgent, because
Debian (which is what most of our users are running) has made SLUB the
default allocator, which is why we're now starting to run into these
panic reports.

The set memory range fix looks good for a backport too ... at least the
page cache is no longer reluctant to use my upper 1GB ...

I worry a bit more about backporting the selection of NUMA as a -stable
fix because it's a larger change (and requires changes to all the
architectures, since NUMA is an arch-local Kconfig variable).

James

----

diff --git a/mm/slub.c b/mm/slub.c
index 94d2a33..243bd9c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -235,7 +235,11 @@ int slab_is_available(void)

 static inline struct kmem_cache_node *get_node(struct kmem_cache *s, int node)
 {
+#ifdef CONFIG_NUMA
 	return s->node[node];
+#else
+	return s->node[0];
+#endif
 }

 /* Verify that a pointer has an address that is valid within a slab page */

