Re: [PATCH] lib/genalloc: use try_cmpxchg in {set,clear}_bits_ll

From: Uros Bizjak
Date: Wed Jan 18 2023 - 16:56:14 EST


On Wed, Jan 18, 2023 at 10:47 PM Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
>
> On Wed, Jan 18, 2023 at 10:18 PM Andrew Morton
> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Wed, 18 Jan 2023 16:07:03 +0100 Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> >
> > > Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
> > > {set,clear}_bits_ll. x86 CMPXCHG instruction returns success in ZF
> > > flag, so this change saves a compare after cmpxchg (and related move
> > > instruction in front of cmpxchg).
> > >
> > > Also, try_cmpxchg implicitly assigns old *ptr value to "old"
> > > when cmpxchg fails.
> > >
> > > Note that the value from *ptr should be read using READ_ONCE to prevent
> > > the compiler from merging, refetching or reordering the read.
> > >
> > > The patch also declares these two functions inline, to ensure inlining.
> >
> > But why is that better? This adds a few hundred bytes more text, which
> > has a cost.
>
> Originally, both functions are inlined and the size of an object file
> is (gcc version 12.2.1, x86_64):
>
>    text    data     bss     dec     hex filename
>    4661     480       0    5141    1415 genalloc-orig.o
>
> When try_cmpxchg is used, gcc chooses to not inline set_bits_ll (its
> estimate of code size is not very precise when multi-line assembly is
> involved), resulting in:
>
>    text    data     bss     dec     hex filename
>    4705     488       0    5193    1449 genalloc-noinline.o
>
> And with an inline added to avoid gcc's quirks:
>
>    text    data     bss     dec     hex filename
>    4629     480       0    5109    13f5 genalloc.o
>
> Considering that these two changed functions are used only in
> genalloc.o, adding the inline qualifier is a win, even when compared
> to the original size.

BTW: Recently, it was determined [1] that calling cpu_relax() inside
a cmpxchg loop can hurt performance. We have the same situation here,
so perhaps cpu_relax() should be removed from these loops as well, in
the same way it was removed from lockref.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f5fe24ef17b5fbe6db49534163e77499fb10ae8c
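
For illustration, here is a minimal userspace sketch of the loop shape
without cpu_relax(). It is not the genalloc code: it uses C11
atomic_compare_exchange_weak() as a stand-in for the kernel's
try_cmpxchg() (both write the current value back into the "expected"
argument on failure), and the names set_bits_sketch() and
clear_bits_sketch() as well as the -1 return value are made up for
this example.

```c
#include <stdatomic.h>

/*
 * Hypothetical userspace analogue of set_bits_ll(): atomically set the
 * bits in mask, or return -1 if any of them is already set.
 */
static inline int set_bits_sketch(_Atomic unsigned long *addr,
				  unsigned long mask)
{
	/* Single initial read, comparable in intent to READ_ONCE(*addr). */
	unsigned long val = atomic_load_explicit(addr, memory_order_relaxed);

	do {
		if (val & mask)
			return -1;
		/*
		 * On failure the compare-exchange stores the current *addr
		 * into val, so the loop needs no explicit re-read and no
		 * separate comparison of the returned value.
		 */
	} while (!atomic_compare_exchange_weak(addr, &val, val | mask));

	return 0;
}

/*
 * Hypothetical userspace analogue of clear_bits_ll(): atomically clear
 * the bits in mask, or return -1 if any of them is not set.
 */
static inline int clear_bits_sketch(_Atomic unsigned long *addr,
				    unsigned long mask)
{
	unsigned long val = atomic_load_explicit(addr, memory_order_relaxed);

	do {
		if ((val & mask) != mask)
			return -1;
	} while (!atomic_compare_exchange_weak(addr, &val, val & ~mask));

	return 0;
}
```

The retry path here consists only of the bit test and the
compare-exchange itself; whether dropping the pause is a win for
genalloc would still need to be measured.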

Uros.