Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

From: Avi Kivity
Date: Tue Jun 09 2009 - 11:56:54 EST


Ingo Molnar wrote:
> * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> I was benchmarking btrfs on my little EeePC. There, kmap overhead was 25% of file access time. Part of it is that people have been taught to use "kmap_atomic()", which is usable under spinlocks and people have been told that it's "fast". It's not fast. The whole TLB thing is slow as hell.
>
> yeah. I noticed it some time ago that INVLPG is unreasonably slow.
>
> My theory is that in the CPU it's perhaps a loop (in microcode?) over _all_ TLBs - so as TLB caches get larger, INVLPG gets slower and slower ...

The tlb already content-addresses entries when looking up translations, so it shouldn't be that bad.

invlpg does have to invalidate all the intermediate entries ("paging-structure caches"), and it does (obviously) force a tlb reload.

I seem to recall 50 cycles for invlpg; what do you characterize as unreasonably slow?

--
error compiling committee.c: too many arguments to function
