* Dor Laor <dor.laor@xxxxxxxxx> wrote:It works, actually I already commented it out.
Here [include/asm-x86/tsc.h]:
/* Like get_cycles, but make sure the CPU is synchronized. */
static __always_inline cycles_t get_cycles_sync(void)
{
unsigned long long ret;
unsigned eax, edx;
/*
* Use RDTSCP if possible; it is guaranteed to be synchronous
* and doesn't cause a VMEXIT on Hypervisors
*/
alternative_io(ASM_NOP3, ".byte 0x0f,0x01,0xf9", X86_FEATURE_RDTSCP,
ASM_OUTPUT2("=a" (eax), "=d" (edx)),
"a" (0U), "d" (0U) : "ecx", "memory");
ret = (((unsigned long long)edx) << 32) | ((unsigned long long)eax);
if (ret)
return ret;
/*
* Don't do an additional sync on CPUs where we know
* RDTSC is already synchronous:
*/
// alternative_io("cpuid", ASM_NOP2, X86_FEATURE_SYNC_RDTSC,
// "=a" (eax), "0" (1) : "ebx","ecx","edx","memory");
rdtscll(ret);
The patch below should resolve this - could you please test and Ack it?
2.6.24-rc for you?I tried to figure out but all the code movements for i386 go in the way.
Ingo
-------------->
Subject: x86: fix get_cycles_sync() overhead
From: Ingo Molnar <mingo@xxxxxxx>
get_cycles_sync() is causing massive overhead in KVM networking:
http://lkml.org/lkml/2007/12/11/54
remove the explicit CPUID serialization - it causes VM exits and is
pointless: we care about GTOD coherency but that goes to user-space
via a syscall, and syscalls are serialization points anyway.
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
include/asm-x86/tsc.h | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
Index: linux-x86.q/include/asm-x86/tsc.h
===================================================================
--- linux-x86.q.orig/include/asm-x86/tsc.h
+++ linux-x86.q/include/asm-x86/tsc.h
@@ -39,8 +39,8 @@ static __always_inline cycles_t get_cycl
unsigned eax, edx;
/*
- * Use RDTSCP if possible; it is guaranteed to be synchronous
- * and doesn't cause a VMEXIT on Hypervisors
+ * Use RDTSCP if possible; it is guaranteed to be synchronous
+ * and doesn't cause a VMEXIT on Hypervisors
*/
alternative_io(ASM_NOP3, ".byte 0x0f,0x01,0xf9", X86_FEATURE_RDTSCP,
ASM_OUTPUT2("=a" (eax), "=d" (edx)),
@@ -50,11 +50,11 @@ static __always_inline cycles_t get_cycl
return ret;
/*
- * Don't do an additional sync on CPUs where we know
- * RDTSC is already synchronous:
+ * Use RDTSC on other CPUs. This might not be fully synchronous,
+ * but it's not a problem: the only coherency we care about is
+ * the GTOD output to user-space, and syscalls are synchronization
+ * points anyway:
*/
- alternative_io("cpuid", ASM_NOP2, X86_FEATURE_SYNC_RDTSC,
- "=a" (eax), "0" (1) : "ebx","ecx","edx","memory");
rdtscll(ret);
return ret;