Re: 2.6.13/14 x86 Makefile - Pentiums penalized ?

From: Denis Vlasenko
Date: Wed Sep 14 2005 - 02:52:15 EST


> > > It's documented as being suboptimal to use inc/dec due to it modifying all
> > > of eflags resulting in dependency related stalls. add/sub only modifies
> > > one bit of eflags so is more optimal. However there is a problem of
> >
> > ?! add/sub doesn't modify "only one bit in eflags", it modifies all.
> > In fact, it's dec/inc which does not modify all bits.
> > It doesn't touch 'carry' bit (IIRC).
> >
> > If inc/dec is slower on P4, it must be just another P4 quirk.
>
> You're right about the add and the number of modified bits. The documented
> part is found in the P4 optimisation manual;
>
> "The inc and dec instructions should always be avoided. Using add
> and sub instructions instead avoids data dependence and improves
> performance."
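
For reference, here is a toy model of the flag behaviour being argued about. It is only a sketch: the function names add8 and inc8 are made up for illustration, only CF and ZF are modelled, and it says nothing about actual P4 timing. It shows that inc has to carry the old CF through while add overwrites it, which is presumably the "data dependence" the manual refers to.

```python
# Toy model of how 8-bit add and inc update x86 arithmetic flags.
# Assumption: only CF and ZF are modelled; OF, SF, AF and PF are left out.
# add8 and inc8 are hypothetical helper names, not any real API.

def add8(a, b, flags):
    """add: writes all arithmetic flags, so the incoming flags are ignored."""
    total = a + b
    result = total & 0xFF
    return result, {"CF": total > 0xFF, "ZF": result == 0}

def inc8(a, flags):
    """inc: updates ZF but passes the incoming CF through unchanged."""
    result = (a + 1) & 0xFF
    return result, {"CF": flags["CF"], "ZF": result == 0}

old = {"CF": False, "ZF": False}
print(add8(0xFF, 1, old))  # (0, {'CF': True, 'ZF': True})
print(inc8(0xFF, old))     # (0, {'CF': False, 'ZF': True})
```

Note that inc8 needs the previous flags to produce its result while add8 does not; on a CPU that tracks flags as one unit, that extra input is where a stall could come from.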

I read this as "get a saner processor".

But frankly, CPU optimization manuals in general, not just Intel's,
tend toward semi-absurd suggestions like "align all data
and branch targets to $BIGNUM bytes" or "do not use instruction X
(because we failed to make it work reasonably fast), use Y".

Of course the next-gen CPU will prefer Y over X
(for example, AMD's recommended way to do NOPs).
--
vda