Re: [PATCH] slab: Fix obj_ext is mistakenly considered NULL due to race condition

From: Hao Ge

Date: Thu Oct 23 2025 - 05:11:58 EST


Hi Harry


On 2025/10/23 17:06, Harry Yoo wrote:
On Thu, Oct 23, 2025 at 04:46:42PM +0800, Hao Ge wrote:
Hi Harry


On 2025/10/23 16:23, Hao Ge wrote:
Hi Harry


On 2025/10/23 15:50, Harry Yoo wrote:
On Thu, Oct 23, 2025 at 11:11:56AM +0800, Hao Ge wrote:
Hi Harry


On 2025/10/23 10:24, Harry Yoo wrote:
On Thu, Oct 23, 2025 at 09:21:17AM +0800, Hao Ge wrote:
From: Hao Ge <gehao@xxxxxxxxxx>

If two competing threads enter alloc_slab_obj_exts(), and the
thread that failed to allocate the object extension vector exits
after the one that succeeded, it will mistakenly assume slab->obj_ext
is still empty due to its own allocation failure. This
will then trigger
warnings enforced by CONFIG_MEM_ALLOC_PROFILING_DEBUG checks in
the subsequent free path.

Therefore, let's add an additional check when
alloc_slab_obj_exts fails.

Signed-off-by: Hao Ge <gehao@xxxxxxxxxx>
---
   mm/slub.c | 9 ++++++---
   1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index d4403341c9df..42276f0cc920 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2227,9 +2227,12 @@ prepare_slab_obj_exts_hook(struct
kmem_cache *s, gfp_t flags, void *p)
       slab = virt_to_slab(p);
       if (!slab_obj_exts(slab) &&
           alloc_slab_obj_exts(slab, s, flags, false)) {
-        pr_warn_once("%s, %s: Failed to create slab
extension vector!\n",
-                 __func__, s->name);
-        return NULL;
+        /* Recheck if a racing thread has successfully
allocated slab->obj_exts. */
+        if (!slab_obj_exts(slab)) {
+            pr_warn_once("%s, %s: Failed to create slab
extension vector!\n",
+                     __func__, s->name);
+            return NULL;
+        }
       }
Maybe this patch is a bit paranoid... since if
mark_failed_objexts_alloc()
win cmpxchg() and then someone else allocates the object
extension vector,
the warning will still be printed anyway.
Oh, just to be clear I was talking about the other warning:
pr_warn_once("%s, %s: Failed to create slab extension vector!",
__func__, s->name);

The process that successfully allocates slab_exts will call
handle_failed_objexts_alloc, setting ref->ct = CODETAG_EMPTY
to prevent the warning from being triggered.
But yeah I see what you mean.

As you mentioned, if the process that failed to allocate the vector wins
cmpxchg(), later process that successfully allocate the vector would
call set_codetag_empty(), so no warning.

But if the process that allocates the vector wins cmpxchg(),
then it won't call set_codetag_empty(), so the process
that was trying to set OBJEXTS_ALLOC_FAIL now needs to set the tag.
Yes, the case I'm encountering is exactly this one.

But anyway, I think there is a better way to do this:
What do you think about the diff I suggested below, though?
Sorry for the delayed response earlier; I was trying to deduce all
possible scenarios.

It makes sense to me, and I will submit the V2 version based on this
suggestion.

Thank you for your help.

diff --git a/mm/slub.c b/mm/slub.c
index dd4c85ea1038..d08d7580349d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2052,9 +2052,9 @@ static inline void
mark_objexts_empty(struct slabobj_ext *obj_exts)
       }
   }
-static inline void mark_failed_objexts_alloc(struct slab *slab)
+static inline bool mark_failed_objexts_alloc(struct slab *slab)
   {
-    cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL);
+    return cmpxchg(&slab->obj_exts, 0, OBJEXTS_ALLOC_FAIL) == 0;
   }
   static inline void handle_failed_objexts_alloc(unsigned
long obj_exts,
@@ -2076,7 +2076,7 @@ static inline void
handle_failed_objexts_alloc(unsigned long obj_exts,
   #else /* CONFIG_MEM_ALLOC_PROFILING_DEBUG */
   static inline void mark_objexts_empty(struct slabobj_ext
*obj_exts) {}
-static inline void mark_failed_objexts_alloc(struct slab *slab) {}
+static inline bool mark_failed_objexts_alloc(struct slab
*slab) { return true; }
Maybe it returns false here.

When CONFIG_MEM_ALLOC_PROFILING_DEBUG is not enabled,

The following condition will never be executed:

if (!mark_failed_objexts_alloc(slab) && slab_obj_exts(slab))
Good point. But without CONFIG_MEM_ALLOC_PROFILING_DEBUG, we don't know
if someone else successfully allocated the vector or not (unlike, with
CONFIG_MEM_ALLOC_PROFILING_DEBUG enabled, we know that when we lose
cmpxchg()). We cannot "fix" the case where a process fails to allocate
the vector but another allocates the vector.

So I'm not sure if checking slab_obj_exts() once more is worth it in
this case, but I'm fine with either way.

if another process that allocates the vector, we will lose one count.
By "one count" you mean skipping accounting the object in memory
profiling, right?
Yes.

   static inline void handle_failed_objexts_alloc(unsigned
long obj_exts,
               struct slabobj_ext *vec, unsigned int objects) {}
@@ -2125,7 +2125,9 @@ int alloc_slab_obj_exts(struct slab
*slab, struct kmem_cache *s,
       }
       if (!vec) {
           /* Mark vectors which failed to allocate */
-        mark_failed_objexts_alloc(slab);
+        if (!mark_failed_objexts_alloc(slab) &&
+            slab_obj_exts(slab))
+            return 0;
           return -ENOMEM;
       }