[PATCH v2] arch/powerpc/include/asm/barrier.h: redefine rmb and wmb to lwsync

From: Kautuk Consul
Date: Wed Feb 22 2023 - 04:04:10 EST


A page on ibm.com describing the __lwsync built-in states:
"Ensures that all instructions preceding the call to __lwsync
complete before any subsequent store instructions can be executed
on the processor that executed the function. Also, it ensures that
all load instructions preceding the call to __lwsync complete before
any subsequent load instructions can be executed on the processor
that executed the function. This allows you to synchronize between
multiple processors with minimal performance impact, as __lwsync
does not wait for confirmation from each processor."

That's why smp_rmb() and smp_wmb() are defined to use lwsync.
But the same reasoning applies equally to instruction reordering
within a single PowerPC processor's pipeline.
So use the lwsync instruction for rmb() and wmb() on the PPC
sub-architectures that support it.
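
To illustrate what these barriers must order, here is a minimal sketch
(not part of this patch; it assumes kernel context where rmb()/wmb()
are available, and the variable and function names are made up) of the
classic message-passing pattern:

	static int payload;
	static int ready;

	void publish(void)
	{
		payload = 42;	/* write the data			*/
		wmb();		/* order data store before flag store	*/
		ready = 1;	/* publish the flag			*/
	}

	int consume(void)
	{
		if (ready) {	/* observe the flag...			*/
			rmb();	/* ...before loading the data		*/
			return payload;
		}
		return -1;
	}

For cacheable memory such as the variables above, lwsync already
provides the required load-load and store-store ordering, so both
barriers can avoid the heavier sync on sub-arches that have lwsync.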

Signed-off-by: Kautuk Consul <kconsul@xxxxxxxxxxxxxxxxxx>
---
arch/powerpc/include/asm/barrier.h | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index b95b666f0374..e088dacc0ee8 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -36,8 +36,15 @@
* heavy-weight sync, so smp_wmb() can be a lighter-weight eieio.
*/
#define __mb() __asm__ __volatile__ ("sync" : : : "memory")
+
+/* The sub-arch has lwsync. */
+#if defined(CONFIG_PPC64) || defined(CONFIG_PPC_E500MC)
+#define __rmb() __asm__ __volatile__ ("lwsync" : : : "memory")
+#define __wmb() __asm__ __volatile__ ("lwsync" : : : "memory")
+#else
#define __rmb() __asm__ __volatile__ ("sync" : : : "memory")
#define __wmb() __asm__ __volatile__ ("sync" : : : "memory")
+#endif

/* The sub-arch has lwsync */
#if defined(CONFIG_PPC64) || defined(CONFIG_PPC_E500MC)
--
2.31.1