Re: [PATCH] mm/slub: fix a BUG_ON() when offlining a memory node and CONFIG_SLUB_DEBUG is on

From: Jiang Liu
Date: Wed Jul 18 2012 - 12:53:03 EST

Next message: Chris Metcalf: "[PATCH 1/3] net: tilegx driver bugfix (be explicit about percpu queue number)"
Previous message: Stefano Stabellini: "Re: [PATCH WIP 6/6] xen/arm: enable evtchn irqs"
In reply to: Christoph Lameter: "RE: [PATCH] mm/slub: fix a BUG_ON() when offlining a memory node and CONFIG_SLUB_DEBUG is on"
Next in thread: Christoph Lameter: "Re: [PATCH] mm/slub: fix a BUG_ON() when offlining a memory node and CONFIG_SLUB_DEBUG is on"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Chris,
After another round of code review, I found that the previous analysis of the
BUG_ON() issue was incorrect.
The real issue is that early_kmem_cache_node_alloc() calls
inc_slabs_node(kmem_cache_node, node, page->objects) to increase the object
count on the local node regardless of whether the page was allocated from the
local node or a remote one. With the current implementation this is OK, because
every memory node has normal memory, so the page is always allocated from the
local node. We are now working on a patch set to improve memory hotplug. The
basic idea is to let some memory nodes host only ZONE_MOVABLE, so that we can
easily remove a whole memory node when needed. That means some memory nodes
have no ZONE_NORMAL/ZONE_DMA, and the page will be allocated from a remote node
in early_kmem_cache_node_alloc(). But early_kmem_cache_node_alloc() still
increases the object count on the local node, which eventually triggers the
BUG_ON() when the affected memory node is removed.
I will try to work out another version of the patch.
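
To make the mismatch concrete, here is a small userspace sketch. It is not
kernel code: the counters, the has_normal[] array and all function names are
hypothetical stand-ins, and the fallback policy in alloc_page_node_model() is
an assumption made only for illustration. It shows the bootstrap path charging
the slab to the requested node while the page comes from another node.

```c
#define MAX_NODES 2

/* Simplified stand-in for the per-node counters (hypothetical model). */
struct node_counters {
	long nr_slabs;
	long total_objects;
};

static struct node_counters counters[MAX_NODES];

/*
 * Model of page allocation: use the preferred node if it has normal
 * memory, otherwise fall back to the first node that does.
 * Returns the node the page actually comes from, or -1 if none.
 */
static int alloc_page_node_model(int preferred, const int *has_normal)
{
	if (has_normal[preferred])
		return preferred;
	for (int n = 0; n < MAX_NODES; n++)
		if (has_normal[n])
			return n;
	return -1;
}

/*
 * Model of the bootstrap path: the slab is always charged to the
 * requested node, whatever node the page was really allocated from.
 */
static int bootstrap_node_model(int node, const int *has_normal, int objects)
{
	int page_nid = alloc_page_node_model(node, has_normal);

	counters[node].nr_slabs++;
	counters[node].total_objects += objects;
	return page_nid;
}

/* Model of the offline check: the node must not account for any slab. */
static int node_can_go_offline_model(int node)
{
	return counters[node].nr_slabs == 0;
}
```

With has_normal = {1, 0}, bootstrapping node 1 returns page node 0 but leaves
a slab accounted on node 1, so the offline check for node 1 fails in this
model even though no slab page lives there.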
Thanks!
Gerry

On 07/18/2012 01:39 AM, Christoph Lameter wrote:
> On Wed, 18 Jul 2012, Jiang Liu wrote:
>
>> From: Jianguo Wu <wujianguo@xxxxxxxxxx>
>>
>> SLUB allocator may cause a BUG_ON() when offlining a memory node if
>> CONFIG_SLUB_DEBUG is on. The scenario is:
>>
>> 1) when creating the kmem_cache_node slab, inc_slabs_node() is called twice:
>>    early_kmem_cache_node_alloc
>>      ->new_slab
>>        ->inc_slabs_node
>>      ->inc_slabs_node
>
> New slab will not be able to increment the slab counter. It will
> check that there is no per node structure yet and then skip the inc slabs
> node.
>
> This suggests that a call to early_kmem_cache_node_alloc was not needed
> because the per node structure already existed. Let's fix that instead.
>

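The check described in the quoted text can be sketched in userspace as follows.
This is only a model under stated assumptions: the struct and function names
(all ending in _model) are hypothetical, and the ordering inside
early_alloc_model() is a simplification of the call chain quoted above. It
shows why the increment reached through new_slab is skipped while the per-node
structure does not exist yet, so the slab is counted once rather than twice.

```c
#include <stddef.h>

#define MAX_NODES 2

/* Hypothetical stand-ins for the per-node structure and the cache. */
struct kmem_cache_node_model {
	long nr_slabs;
	long total_objects;
};

struct kmem_cache_model {
	struct kmem_cache_node_model *node[MAX_NODES];
};

/* The increment is skipped when the per-node structure is not set up yet. */
static void inc_slabs_node_model(struct kmem_cache_model *s, int node,
				 int objects)
{
	struct kmem_cache_node_model *n = s->node[node];

	if (n) {
		n->nr_slabs++;
		n->total_objects += objects;
	}
}

/* Model of new_slab: only the accounting side effect is kept. */
static void new_slab_model(struct kmem_cache_model *s, int node, int objects)
{
	inc_slabs_node_model(s, node, objects);
}

/*
 * Model of the bootstrap path: the first increment (through new_slab)
 * finds no per-node structure and is skipped; the second one runs after
 * the structure is installed and is the only one that counts.
 * Returns the resulting slab count for the node.
 */
static long early_alloc_model(struct kmem_cache_model *s,
			      struct kmem_cache_node_model *storage,
			      int node, int objects)
{
	new_slab_model(s, node, objects);	/* skipped: s->node[node] is NULL */

	storage->nr_slabs = 0;
	storage->total_objects = 0;
	s->node[node] = storage;		/* install the per-node structure */

	inc_slabs_node_model(s, node, objects);	/* counted */
	return storage->nr_slabs;
}
```

In this model, a second call for a node whose structure already exists would
be counted by both increments, which is the situation the quoted text suggests
avoiding.
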
