[PATCH] percpu: avoid extra NOP in percpu_cmpxchg16b_double

From: Eric Dumazet
Date: Mon Mar 28 2011 - 06:32:28 EST


percpu_cmpxchg16b_double() uses alternative_io(). Before patching, the
emitted code looks like:

e8 .. .. .. .. call this_cpu_cmpxchg16b_emu
X bytes NOPX

or, once patched (if the CPU supports the native instruction), on an SMP build:

65 48 0f c7 0e cmpxchg16b %gs:(%rsi)
0f 94 c0 sete %al

and on a !SMP build:

48 0f c7 0e cmpxchg16b (%rsi)
0f 94 c0 sete %al

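The call is 5 bytes, while the replacement is 8 bytes on SMP (the %gs
segment prefix costs one byte) and 7 bytes on !SMP. The current P6_NOP4
makes the original sequence 9 bytes, which is longer than either
replacement. Therefore, NOPX should be: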

P6_NOP3 on SMP
P6_NOP2 on !SMP
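
The byte counting behind these two values can be sketched in plain C. This is
only an illustration, not kernel code: the encodings are copied from the
disassembly above, the call displacement is zeroed, and the helper names
(nop_padding, smp_padding, up_padding, check_patch_padding) are made up for
this example.

```c
#include <assert.h>
#include <stddef.h>

/* Byte encodings copied from the disassembly in this changelog. */
static const unsigned char call_emu[]    = { 0xe8, 0, 0, 0, 0 };             /* call rel32, displacement zeroed */
static const unsigned char cmpxchg_smp[] = { 0x65, 0x48, 0x0f, 0xc7, 0x0e }; /* cmpxchg16b %gs:(%rsi) */
static const unsigned char cmpxchg_up[]  = { 0x48, 0x0f, 0xc7, 0x0e };       /* cmpxchg16b (%rsi) */
static const unsigned char sete_al[]     = { 0x0f, 0x94, 0xc0 };             /* sete %al */

/* Hypothetical helper: NOP bytes needed so the original is as long as the replacement. */
static size_t nop_padding(size_t orig_len, size_t repl_len)
{
	return repl_len > orig_len ? repl_len - orig_len : 0;
}

/* Padding after the call when the replacement carries the %gs prefix. */
static size_t smp_padding(void)
{
	return nop_padding(sizeof(call_emu), sizeof(cmpxchg_smp) + sizeof(sete_al));
}

/* Padding after the call when no segment prefix is emitted. */
static size_t up_padding(void)
{
	return nop_padding(sizeof(call_emu), sizeof(cmpxchg_up) + sizeof(sete_al));
}

/* Checks the values this patch picks: P6_NOP3 on SMP, P6_NOP2 on !SMP. */
static void check_patch_padding(void)
{
	assert(smp_padding() == 3);
	assert(up_padding() == 2);
}
```

Calling check_patch_padding() from a small main() is enough to confirm both
sizes; the P6_NOP4 being replaced would correspond to a padding of 4, one byte
more than the SMP case needs and two more than the !SMP case.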

Signed-off-by: Eric Dumazet <eric.dumazet@xxxxxxxxx>
Cc: Christoph Lameter <cl@xxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxx>
Cc: Pekka Enberg <penberg@xxxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
---
arch/x86/include/asm/percpu.h | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index d475b43..d68fca6 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -509,6 +509,11 @@ do { \
* it in software. The address used in the cmpxchg16 instruction must be
* aligned to a 16 byte boundary.
*/
+#ifdef CONFIG_SMP
+#define CMPXCHG16B_EMU_CALL "call this_cpu_cmpxchg16b_emu\n\t" P6_NOP3
+#else
+#define CMPXCHG16B_EMU_CALL "call this_cpu_cmpxchg16b_emu\n\t" P6_NOP2
+#endif
#define percpu_cmpxchg16b_double(pcp1, o1, o2, n1, n2) \
({ \
char __ret; \
@@ -517,7 +522,7 @@ do { \
typeof(o2) __o2 = o2; \
typeof(o2) __n2 = n2; \
typeof(o2) __dummy; \
- alternative_io("call this_cpu_cmpxchg16b_emu\n\t" P6_NOP4, \
+ alternative_io(CMPXCHG16B_EMU_CALL, \
"cmpxchg16b " __percpu_prefix "(%%rsi)\n\tsetz %0\n\t", \
X86_FEATURE_CX16, \
ASM_OUTPUT2("=a"(__ret), "=d"(__dummy)), \

