Re: [patch] uninline init_waitqueue_*() functions

From: Ingo Molnar
Date: Wed Jul 05 2006 - 17:47:55 EST



* Linus Torvalds <torvalds@xxxxxxxx> wrote:

> On Wed, 5 Jul 2006, Ingo Molnar wrote:
> >
> > i had CONFIG_DEBUG_INFO (and UNWIND_INFO) disabled in all these build
> > tests.
>
> Good, because I just verified: those two together will on their own
> increase "text size" by about 17% for me.
>
> I still think Andrew is right: I don't see how an initializer that
> should basically be three instructions can possibly be 35 bytes larger
> than a function call that should be a minimum of two instructions
> (argument setup in %eax and the actual call - and that's totally
> ignoring the deleterious effects of a function call on register
> liveness).
>
> The fact that with allnoconfig the kernel is _smaller_ (but, quite
> frankly, within the noise) with the inlined version would seem to
> back up Andrew's position that it really shouldn't matter.

well, the allnoconfig thing is artificial (and uninteresting) for a
number of reasons:

- it has REGPARM disabled which penalizes function calls

- it's UP and hence the inlining cost of init_waitqueue_head() is
significantly smaller.

- allnoconfig has smaller average function size - increasing the cost of
uninlining

> So I'm left wondering why it matters for you, and what triggers it.
> Maybe there is some secondary issue that could show us an even more
> interesting optimization (or some compiler behaviour that we should
> try to encourage).

yeah, i'd not want to skip over some interesting and still unexplained
effect either, but 35 bytes isn't all that outlandish and from everything
i've seen it's a real win. Here is an actual example:

c0fb6137:  c7 44 24 08 00 00 00    movl   $0x0,0x8(%esp)
c0fb613e:  00
c0fb613f:  c7 44 24 08 01 00 00    movl   $0x1,0x8(%esp)
c0fb6146:  00
c0fb6147:  c7 43 60 00 00 00 00    movl   $0x0,0x60(%ebx)
c0fb614e:  8b 44 24 08             mov    0x8(%esp),%eax
c0fb6152:  89 43 5c                mov    %eax,0x5c(%ebx)
c0fb6155:  8d 43 64                lea    0x64(%ebx),%eax
c0fb6158:  89 40 04                mov    %eax,0x4(%eax)
c0fb615b:  89 43 64                mov    %eax,0x64(%ebx)

versus:

c0fb070e:  8d 43 5c                lea    0x5c(%ebx),%eax
c0fb0711:  e8 94 98 18 ff          call   c0139faa <init_waitqueue_head>

so 39 bytes versus 8 bytes - 31 bytes saved at this call site. It's a
similar win in other cases i checked too. (the only exception is the
smaller functions that i mentioned before, where the parameters are not
pre-calculated yet, so the function call does not integrate well. In
that case it's break-even, or in some cases a 3-4 byte loss.)
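
For anyone who wants to reproduce this kind of comparison outside the
kernel tree, here is a rough userspace sketch of the two variants. The
struct layout, the field names (lock, aux) and the wq_* function names
are made-up stand-ins, not the kernel's real definitions, and
__attribute__((noinline)) assumes GCC or Clang:

```c
#include <assert.h>

/* Hypothetical stand-ins, not the kernel's real definitions. */
struct list_head {
    struct list_head *next, *prev;
};

struct wq_head {
    unsigned int lock;        /* stand-in for an SMP spinlock word */
    unsigned int aux;         /* stand-in for an extra per-lock field */
    struct list_head task_list;
};

/* Inline variant: the four stores below are emitted at every call site. */
static inline void wq_init_inline(struct wq_head *q)
{
    q->lock = 1;              /* assumed "unlocked" value */
    q->aux = 0;
    q->task_list.next = &q->task_list;
    q->task_list.prev = &q->task_list;
}

/* Out-of-line variant: a call site only sets up the argument and calls. */
__attribute__((noinline)) void wq_init_outofline(struct wq_head *q)
{
    wq_init_inline(q);
}

/* Sanity helper for this sketch: both variants must leave the same state. */
static void wq_check_initialized(const struct wq_head *q)
{
    assert(q->lock == 1);
    assert(q->aux == 0);
    assert(q->task_list.next == &q->task_list);
    assert(q->task_list.prev == &q->task_list);
}
```

Building callers of each variant with something like "gcc -O2 -m32" and
looking at the call sites with "objdump -d" should show the same general
pattern as above, although the exact byte counts will depend on the
compiler version and options.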

Ingo