Re: [PATCH 18/18] arm64: lto: Strengthen READ_ONCE() to acquire when CLANG_LTO=y

From: Arnd Bergmann
Date: Tue Jun 30 2020 - 15:25:24 EST


On Tue, Jun 30, 2020 at 7:39 PM Will Deacon <will@xxxxxxxxxx> wrote:
> +#define __READ_ONCE(x) \
> +({ \
> + int atomic = 1; \
> + union { __unqual_scalar_typeof(x) __val; char __c[1]; } __u; \
> + typeof(&(x)) __x = &(x); \
> + switch (sizeof(x)) { \
...
> + atomic ? (typeof(x))__u.__val : (*(volatile typeof(x) *)__x); \
> +})

This expands (x) nine times (five in __unqual_scalar_typeof()), which can
lead to significant code bloat after preprocessing if something passes a
compound expression into READ_ONCE().
The compiler works it out eventually, but we've seen an actual slowdown
in compile speed from this recently, especially on clang.

I think if you move the

typeof(&(x)) __x = &(x);

line first, all other instances can use typeof(*__x) instead of typeof(x)
and avoid this problem. Once we make gcc-4.9 the minimum version,
this could be further improved to

__auto_type __x = &(x);
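
To make the suggestion concrete, here is a rough, untested-in-tree sketch of
the two shapes. It is not the macro from the patch: the names
READ_ONCE_TYPEOF, READ_ONCE_AUTO and read_both are made up for illustration,
the size switch and the acquire path are left out entirely, and I spell it
__typeof__ so it also builds outside the kernel with a strict -std= setting.
The only point is how often (x) appears in the macro body.

```c
#include <assert.h>

/*
 * Pointer taken first: (x) appears twice in the body (once inside
 * __typeof__, once in the initializer). Everything after that refers
 * to *__x instead of repeating (x).
 */
#define READ_ONCE_TYPEOF(x)					\
({								\
	__typeof__(&(x)) __x = &(x);				\
	*(volatile __typeof__(*__x) *)__x;			\
})

/*
 * Same idea with __auto_type (needs gcc-4.9 or later, or clang):
 * (x) appears exactly once in the body.
 */
#define READ_ONCE_AUTO(x)					\
({								\
	__auto_type __x = &(x);					\
	*(volatile __typeof__(*__x) *)__x;			\
})

/* Hypothetical helper: both variants must read the same value. */
static int read_both(int *p)
{
	int a = READ_ONCE_TYPEOF(*p);
	int b = READ_ONCE_AUTO(*p);

	assert(a == b);
	return a;
}
```

Since typeof does not evaluate its operand here, the generated code should
be the same either way; the difference is only in how much text the
preprocessor hands to the compiler when (x) is itself a large macro
expansion.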

Arnd