Re: [patch] speed up / fix the new generic semaphore code (fix AIM740% regression with 2.6.26-rc1)
From: Linus Torvalds
Date:  Thu May 08 2008 - 19:14:53 EST
On Thu, 8 May 2008, Linus Torvalds wrote:
> 
> Btw, sparse will complain about those, because the source code *looks* 
> really cheap.
Sometimes you can fix it.
For example, this change:
	-       if (pte_present(*pte) && page_to_pfn(page) == pte_pfn(*pte)) {
	+       if (pte_present(*pte) && page == pfn_to_page(pte_pfn(*pte))) {
can simplify things: instead of moving from a 'struct page' to a pfn, it 
moves from a pfn to a 'struct page', and that is generally cheaper 
(multiply rather than divide by size of struct page). It's not always the 
same thing to do, but I think in this case we can. For me, the code 
generation changes:
	-       movabsq $7905747460161236407, %rdx      #, tmp111
	-       movabsq $32985348833280, %rax   #, tmp107
	-       leaq    (%r12,%rax), %rax       #, tmp106
	-       sarq    $3, %rax        #, tmp106
	-       imulq   %rdx, %rax      # tmp111, tmp106
	-       movabsq $70368744177663, %rdx   #, tmp113
	-       andq    %rdx, %rcx      # tmp113, pte$pte
	-       shrq    $12, %rcx       #, pte$pte
	-       cmpq    %rcx, %rax      # pte$pte, tmp106
	+       movabsq $70368744177663, %rax   #, tmp107
	+       andq    %rax, %rdx      # tmp107, pte$pte
	+       shrq    $12, %rdx       #, pte$pte
	+       imulq   $56, %rdx, %rax #, pte$pte, tmp109
	+       movabsq $-32985348833280, %rdx  #, tmp111
	+       addq    %rdx, %rax      # tmp111, tmp110
	+       cmpq    %rax, %r13      # tmp110, page
which isn't a *huge* deal, but it certainly looks better. One less big 
constant, and one less shift.
It's not going to make a huge difference, though. That function is just 
called too much, and it would still be entirely data-dependent all the way 
through.
			Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/