Re: [RFC][PATCH 11/12] slub: Replace cmpxchg_double()

From: Linus Torvalds
Date: Tue Jan 03 2023 - 14:09:14 EST


On Tue, Jan 3, 2023 at 9:17 AM Heiko Carstens <hca@xxxxxxxxxxxxx> wrote:
>
> On Mon, Dec 19, 2022 at 04:35:36PM +0100, Peter Zijlstra wrote:
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > ---
> > include/linux/slub_def.h | 12 ++-
> > mm/slab.h | 41 +++++++++++--
> > mm/slub.c | 146 ++++++++++++++++++++++++++++-------------------
> > 3 files changed, 135 insertions(+), 64 deletions(-)
>
> Does this actually work? Just wondering since I end up with an instant
> list corruption on s390. Might be endianness related, but I can't see
> anything obvious at a first glance.

I don't see anything that looks related to endianness, because while
there is that 128-bit union member, it's always either used in full,
or it's accessed as other union members.

But I *do* note that this patch seems to be the only one that depends
on the new this_cpu_cmpxchg() updates to make it just automatically do
the right thing for a 128-bit value. And I have to admit that all
those games with __pcpu_cast_128() make no sense to me. Why isn't it
just using "u128" everywhere without any odd _Generic() games?

I could also easily see that if the asm constraints are wrong (like
the "cast pointer to (unsigned long *) instead of keeping it pointing
to a 128-bit type" thing discussed earlier), then code like this:

+ freelist_aba_t old = { .freelist = freelist_old, .counter = tid };
+ freelist_aba_t new = { .freelist = freelist_new, .counter =
next_tid(tid) };
+
+ return this_cpu_cmpxchg(s->cpu_slab->freelist_tid.full,
+ old.full, new.full) == old.full;

would easily make the compiler go "the second word of 'old' is never
used by the asm, so I won't initialize it".

But yeah, that patch is hard to read, so hard to say. Does everything
leading up to it work fine?

Linus