On Tue, 4 February 2003 14:11:56 +0100, Helge Hafting wrote:
> Looks like a cacheline alignment issue to me.
> This loop of yours occupy x cachelines on your cpu,
> moving it in memory by adding the printf
> might cause it to ocupy x+1 cachelines.
> That might be noticeable if x is a really small number,
> such as 1.
Makes a lot of sense.
> My advice is to put your test loop in a function of its own,
> and do the printing in the function that calls it.
> functions are always aligned the same (good) way so
> that calling them will be fast.
> You can tune the speed of your inner loop by experimenting
> with the insertion of one or more NOP asms in front
> of the loop. Just be aware that all such tuning is wasted once
> you change anything at all in that function - you'll have to
> re-do the tuning each time.
> The compiler should ideally align the loops for maximum performance.
> That can be hard though, considering all the different processors
> that might run your program. And aligning everything optimally
> could waste a _lot_ of code space - so do this only for
> small loops with lots of iterations.
The compiler has a hard time to identify those loops that affect
performance as opposed to those that are run 2-3 times.
But the developer can usually profile and figure out, where those
loops are. I wonder if the following would be possible.
include/linux/cache.h appears to define such for data structures, but
not for code.
-- ticks = jiffies; while (ticks == jiffies); ticks = jiffies; -- /usr/src/linux/init/main.c - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to firstname.lastname@example.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Fri Feb 07 2003 - 22:00:14 EST