Re: Some functions are not inlined by gcc 3.2, resulting code is ugly

From: Jussi Laako (jussi.laako@kolumbus.fi)
Date: Sun Nov 03 2002 - 16:28:05 EST


On Mon, 2002-11-04 at 02:17, Denis Vlasenko wrote:

> Alignment does not eliminate jump. It only moves jump target to 16 byte
> boundary.

Exactly. And P4 cache is _very_ bad at anything not 16-byte aligned. The
speed penalty is big. This seems to be a problem only with Intel CPUs; there
are no such large effects on AMD ones.

> This _probably_ makes execution slightly faster but on average
> it costs you 7,5 bytes. This price is too high when you take into account
> L1 instruction cache wastage and current bus/core clock ratios.

7.5 bytes is not much compared to the possibility of a trashed cache or a
pipeline flush.
Do you have execution time numbers for a jump to a 16-byte aligned address vs
an unaligned address?

        - Jussi Laako





