Re: [PATCH RFC 2/2] [x86] Optimize copy_page by re-arranging instruction sequence and saving register

From: George Spelvin
Date: Fri Oct 12 2012 - 17:02:49 EST


Here are some Phenom results for that benchmark. The average time
increases from 700 to 760 cycles (+8.6%).

vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : AMD Phenom(tm) 9850 Quad-Core Processor
stepping : 3
microcode : 0x1000083
cpu MHz : 2500.210
cache size : 512 KB
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs hw_pstate npt lbrv svm_lock
bogomips : 5000.42
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64

copy_page_org copy_page_new
TPT: Len 4096, alignment 0/ 0: 678 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
copy_page_org copy_page_new
TPT: Len 4096, alignment 0/ 0: 667 760
TPT: Len 4096, alignment 0/ 0: 673 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
copy_page_org copy_page_new
TPT: Len 4096, alignment 0/ 0: 667 760
TPT: Len 4096, alignment 0/ 0: 673 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
copy_page_org copy_page_new
TPT: Len 4096, alignment 0/ 0: 671 760
TPT: Len 4096, alignment 0/ 0: 673 760
TPT: Len 4096, alignment 0/ 0: 671 760
TPT: Len 4096, alignment 0/ 0: 709 760
TPT: Len 4096, alignment 0/ 0: 708 760
copy_page_org copy_page_new
TPT: Len 4096, alignment 0/ 0: 667 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
copy_page_org copy_page_new
TPT: Len 4096, alignment 0/ 0: 671 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
copy_page_org copy_page_new
TPT: Len 4096, alignment 0/ 0: 678 760
TPT: Len 4096, alignment 0/ 0: 709 758
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 709 759
TPT: Len 4096, alignment 0/ 0: 710 760
copy_page_org copy_page_new
TPT: Len 4096, alignment 0/ 0: 680 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
copy_page_org copy_page_new
TPT: Len 4096, alignment 0/ 0: 667 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 709 760
TPT: Len 4096, alignment 0/ 0: 709 759
TPT: Len 4096, alignment 0/ 0: 710 760
copy_page_org copy_page_new
TPT: Len 4096, alignment 0/ 0: 678 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
TPT: Len 4096, alignment 0/ 0: 710 760
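
For anyone who wants to check the averages quoted at the top against the raw output, here is a minimal sketch in Python. It is not part of the benchmark; the function name summarize() and the regular expression are my own assumptions, based only on the row layout shown above ("TPT: Len ..., alignment .../...: <org> <new>"). Header lines and anything else that does not match are skipped.

```python
import re

# Assumed row layout, taken from the output above:
#   "TPT: Len 4096, alignment 0/ 0: 678 760"
# The two captured groups are the copy_page_org and copy_page_new cycle counts.
ROW = re.compile(
    r"^TPT: Len\s+\d+, alignment\s+\d+/\s*\d+:\s+(\d+)\s+(\d+)\s*$"
)


def summarize(text):
    """Return (mean_org, mean_new, percent_change) over all TPT rows in text."""
    org, new = [], []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue  # column headers, blank lines, cpuinfo, etc.
        org.append(int(m.group(1)))
        new.append(int(m.group(2)))
    if not org:
        raise ValueError("no TPT rows found")
    mean_org = sum(org) / len(org)
    mean_new = sum(new) / len(new)
    percent_change = 100.0 * (mean_new - mean_org) / mean_org
    return mean_org, mean_new, percent_change
```

Pass it the benchmark output as one string, e.g. summarize(open("phenom.txt").read()), where phenom.txt is a hypothetical file holding the lines above. With the rounded figures of 700 and 760 cycles the change comes out to +8.6%.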