Re: [PATCH 0/2] Implement numa node notifier

From: Harry Yoo
Date: Thu Apr 03 2025 - 18:07:06 EST


On Thu, Apr 03, 2025 at 03:02:25PM +0200, David Hildenbrand wrote:
> On 02.04.25 19:03, Oscar Salvador wrote:
> > On Wed, Apr 02, 2025 at 06:06:51PM +0200, Vlastimil Babka wrote:
> > > What if we had two chains:
> > >
> > > register_node_notifier()
> > > register_node_normal_notifier()
> > >
> > > I think they could have shared the state #defines and struct node_notify
> > > would have just one nid and be always >= 0.
> > >
> > > Or would it add too much extra boilerplate and only slab cares?
> >
> > We could indeed go on that direction to try to decouple
> > status_change_nid from status_change_nid_normal.
> >
> > Although as you said, slub is the only user of status_change_nid_normal
> > for the time beign, so I am not sure of adding a second chain for only
> > one user.
> >
> > Might look cleaner though, and the advantatge is that slub would not get
> > notified for nodes adquiring only ZONE_MOVABLE.
> >
> > Let us see what David thinks about it.
>
> I'd hope we'd be able to get rid of the _normal stuff completely, it's seems
> way to specialized.

Hmm, perhaps we can remove it with as part of this patch series?

status_change_nid_normal has been used to indicate both 'There is a
status change' AND 'The node id when the NUMA node has normal memory'.

But since NUMA node notifier triggers only when there is a state change,
it can simply pass nid, like patch 2 does. SLUB can then check whether the
node has normal memory.

Or am I missing something?

> We added that in
>
> commit b9d5ab2562eceeada5e4837a621b6260574dd11d
> Author: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> Date: Tue Dec 11 16:01:05 2012 -0800
>
> slub, hotplug: ignore unrelated node's hot-adding and hot-removing
> SLUB only focuses on the nodes which have normal memory and it ignores the
> other node's hot-adding and hot-removing.
> Aka: if some memory of a node which has no onlined memory is online, but
> this new memory onlined is not normal memory (for example, highmem), we
> should not allocate kmem_cache_node for SLUB.
> And if the last normal memory is offlined, but the node still has memory,
> we should remove kmem_cache_node for that node. (The current code delays
> it when all of the memory is offlined)
> So we only do something when marg->status_change_nid_normal > 0.
> marg->status_change_nid is not suitable here.
> The same problem doesn't exist in SLAB, because SLAB allocates kmem_list3
> for every node even the node don't have normal memory, SLAB tolerates
> kmem_list3 on alien nodes. SLUB only focuses on the nodes which have
> normal memory, it don't tolerate alien kmem_cache_node. The patch makes
> SLUB become self-compatible and avoids WARNs and BUGs in rare conditions.
>
>
> How "bad" would it be if we do the slab_mem_going_online_callback() etc even
> for completely-movable nodes? I assume one kmem_cache_alloc() per slab_caches.
>
> slab_mem_going_offline_callback() only does shrinking, #dontcare
>
> Looking at slab_mem_offline_callback(), we never even free the caches either
> way when offlining. So the implication would be that we would have movable-only nodes
> set in slab_nodes.
>
>
> We don't expect many such nodes, so ... do we care?
>
> --
> Cheers,
>
> David / dhildenb
>
>

--
Cheers,
Harry (formerly known as Hyeonggon)