Re: Performance overhead of get_cycles_sync

From: Dor Laor
Date: Tue Dec 11 2007 - 09:57:35 EST


Andi Kleen wrote:
> [headers rewritten because of gmane crosspost breakage]
>
>> In the latest kernel (2.6.24-rc3) I noticed a drastic performance decrease for KVM networking.
>
> That should not have changed for quite some time.
>
> Also it depends on the CPU of course.

I didn't find the exact place of the change, but with the Fedora 2.6.23-8 kernel there is no problem.
Commit 3aefbe0746580a710d4392a884ac1e4aac7c728f turns X86_FEATURE_SYNC_RDTSC off for most
Intel CPUs, but it was committed in May.

>> The reason is many vmexits (exit reason is the cpuid instruction) caused by
>> calls to gettimeofday, which uses the tsc clocksource.
>> read_tsc calls get_cycles_sync, which might call cpuid in order to serialize the cpu.
>>
>> Can you explain why the cpu needs to be serialized for every gettime call?
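
For reference, here is a rough user-space sketch of the kind of serialized
TSC read being discussed: CPUID first, then RDTSC. It is not the kernel's
get_cycles_sync. The name tsc_read_serialized() and the timespec_get()
fallback for non-x86-64 builds are assumptions made purely for illustration.

```c
#include <stdint.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))

/* Hypothetical helper: issue CPUID, then read the TSC. */
static inline uint64_t tsc_read_serialized(void)
{
    uint32_t a = 0, b, c = 0, d, lo, hi;

    /*
     * CPUID is a serializing instruction, so RDTSC cannot be executed
     * ahead of the code that precedes it. In a VT-x guest CPUID also
     * traps to the hypervisor, which is the cost discussed here.
     */
    __asm__ __volatile__("cpuid"
                         : "+a"(a), "=b"(b), "+c"(c), "=d"(d)
                         :
                         : "memory");
    (void)a; (void)b; (void)c; (void)d;

    /* RDTSC returns the 64-bit counter split across EDX:EAX. */
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

#else
#include <time.h>

/* Assumed fallback so the sketch builds elsewhere: nanoseconds from C11 timespec_get(). */
static inline uint64_t tsc_read_serialized(void)
{
    struct timespec ts;

    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif
```

Inside a guest, every call to a helper like this executes one CPUID, which
is where the exits come from.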

> Otherwise RDTSC can be speculated around and happen outside the protection
> of the seqlock and that can sometimes lead to non-monotonic time reporting.

What about moving the result into memory and calling mb() instead?
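
To make the seqlock point concrete, here is a minimal sketch of the
reader/writer pattern using C11 atomics. It is not the kernel code:
struct time_base, timebase_update(), timebase_read_ns(), demo_counter() and
the "one cycle equals one nanosecond" conversion are all made up for
illustration. The counter is read through a function pointer so that the
stand-in can be swapped for a real TSC read.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical time base protected by a sequence counter. */
struct time_base {
    atomic_uint seq;              /* odd while a writer is mid-update */
    _Atomic uint64_t cycle_last;  /* counter value at the last update */
    _Atomic uint64_t base_ns;     /* time in ns at the last update */
};

typedef uint64_t (*counter_fn)(void);

/* Deterministic stand-in for the TSC, for demonstration only. */
static uint64_t demo_counter_value;
static uint64_t demo_counter(void) { return demo_counter_value; }

/* Writer: bump seq to odd, publish the new values, bump seq to even. */
static void timebase_update(struct time_base *tb, uint64_t cycles, uint64_t ns)
{
    unsigned s = atomic_load_explicit(&tb->seq, memory_order_relaxed);

    atomic_store_explicit(&tb->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&tb->cycle_last, cycles, memory_order_relaxed);
    atomic_store_explicit(&tb->base_ns, ns, memory_order_relaxed);
    atomic_store_explicit(&tb->seq, s + 2, memory_order_release);
}

/* Reader: retry until seq is even and unchanged across the whole read. */
static uint64_t timebase_read_ns(struct time_base *tb, counter_fn read_counter)
{
    unsigned s;
    uint64_t now, last, base;

    do {
        s = atomic_load_explicit(&tb->seq, memory_order_acquire);
        last = atomic_load_explicit(&tb->cycle_last, memory_order_relaxed);
        base = atomic_load_explicit(&tb->base_ns, memory_order_relaxed);
        /*
         * The counter must be sampled between the two seq loads. The C11
         * fences here order only the memory accesses; keeping a real RDTSC
         * inside this window is what the serialization is for.
         */
        now = read_counter();
        atomic_thread_fence(memory_order_acquire);
    } while ((s & 1u) ||
             s != atomic_load_explicit(&tb->seq, memory_order_relaxed));

    return base + (now - last);   /* assumes 1 cycle == 1 ns */
}
```

If the counter were sampled before the first seq load or after the last
one, "now" could be paired with a cycle_last/base_ns from a different
update, which is how time can appear to go backwards.
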
> Anyways after a lot of discussions it turns out there are ways to achieve
> this without CPUID and there is a solution implemented for this in the ff
> tree which I will submit for .25. It's a little complicated though
> and not a quick fix.

>> Do we need to be that accurate? (It will also slightly improve physical hosts).
>> I believe you have a reason and the answer is yes. In that case can you replace
>> the serializing instruction with an instruction that does not trigger vmexit?
>> Maybe use 'ltr' for example?
>
> ltr doesn't synchronize RDTSC.

According to Intel spec it is a serializing instruction along with cpuid and others.
> -Andi

