Re: MAP_SHARED bizarrely slow
From: dean gaudet
Date: Thu Oct 28 2004 - 00:57:16 EST
On Wed, 27 Oct 2004, Andrew Morton wrote:
> I get the exact opposite, on a P4:
>
> vmm:/home/akpm/maptest> time ./mm-sharemmap
> ./mm-sharemmap 10.81s user 0.05s system 100% cpu 10.855 total
> vmm:/home/akpm/maptest> time ./mm-sharemmap
> ./mm-sharemmap 11.04s user 0.05s system 100% cpu 11.086 total
> vmm:/home/akpm/maptest> time ./mm-privmmap
> ./mm-privmmap 26.91s user 0.02s system 100% cpu 26.903 total
> vmm:/home/akpm/maptest> time ./mm-privmmap
> ./mm-privmmap 26.89s user 0.02s system 100% cpu 26.894 total
> vmm:/home/akpm/maptest> uname -a
> Linux vmm 2.6.10-rc1-mm1 #14 SMP Tue Oct 26 23:23:23 PDT 2004 i686 i686 i386 GNU/Linux
>
> It's all user time so I can think of no reason apart from physical page
> allocation order causing additional TLB reloads in one case. One is using
> anonymous pages and the other is using shmem-backed pages, although I can't
> think why that would make a difference.
you're experiencing the wonder of the L1 data cache on the P4 ... based on
its behaviour i'm pretty sure that early in the pipeline they use the
virtual address to match a virtual tag and procede with that data as if
it's correct. not until the TLB lookup and physical tag check many cycles
later does it realise that it's done something wrong and pull kill / flush
pipelines.
when you set up a virtual alias, like you have with the shared zero page,
it becomes very confused.
in fact if you do something as simple as a 4 element pointer-chase where
the cache lines for elt 0 and 2 alias, and cache lines for 1 and 3 alias
then you can watch some p4 take up to 3000 cycles per reference.
-dean
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/