Re: [PATCH] Re-implemented i586 asm AES (updated)

From: dean gaudet
Date: Mon Aug 09 2004 - 14:44:52 EST


On Fri, 6 Aug 2004, Linus Torvalds wrote:

> work to first check that the FP context is safed. Even _without_ the extra
> work I simply cannot imagine that a "movd reg,mmx" is faster than a plain
> "movl reg,stackslot".

p3/p-m have datapaths specifically for movd -- apparently a path for
int->mmx and a separate path for mmx->int. the paths are latency 1,
throughput one per clock.

however that line of x86 processors is essentially unique in this
respect...

my measurement methodology is to pair a bunch of movd back and forth
between (int,mmx) register pairs, then measure the avg latency per
round-trip as i increase the number of pairs (i.e. to see how much
parallelism is available.)

i.e. for 2 in parallel my code is this fragment repeated 100 times:

	movd %eax,%mm0
	movd %ebx,%mm1
	movd %mm0,%eax
	movd %mm1,%ebx
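
a rough sketch of how the fragment for other parallelism levels could be generated (this is not the original harness; the helper name emit_fragment and the register choices beyond %eax/%ebx and %mm0/%mm1 are assumptions made for illustration):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PAIRS 6

/* Integer registers paired with %mm0..%mm5. Only the first two pairs
 * appear in the fragment above; the remaining four are assumed. */
static const char *int_regs[MAX_PAIRS] = {
    "eax", "ebx", "ecx", "edx", "esi", "edi"
};

/* Write one round-trip fragment for `pairs` independent (int,mmx)
 * register pairs into buf: first all int->mmx moves, then all mmx->int
 * moves. Returns the number of movd lines written, or -1 if `pairs` is
 * out of range or buf is too small. */
int emit_fragment(char *buf, size_t len, int pairs)
{
    size_t used = 0;
    int i, n;

    assert(buf != NULL);
    if (len == 0 || pairs < 1 || pairs > MAX_PAIRS)
        return -1;
    buf[0] = '\0';

    for (i = 0; i < pairs; i++) {
        n = snprintf(buf + used, len - used, "movd %%%s,%%mm%d\n",
                     int_regs[i], i);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    for (i = 0; i < pairs; i++) {
        n = snprintf(buf + used, len - used, "movd %%mm%d,%%%s\n",
                     i, int_regs[i]);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return 2 * pairs;
}
```

calling emit_fragment(buf, sizeof buf, 2) reproduces the four-line fragment above; the output would then be repeated 100 times and timed, with the total divided by the number of round trips.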

here's the average round-trip latency (in cycles) per movd/movd pair
for various processors i have access to, at each level of available
parallelism:

                     available parallelism
              1      2      3      4      5      6
p-m        2.00   1.30   1.00   1.04   1.01   1.00
p4-2       8.00   4.05   2.67   2.50   2.40   2.33
p4-3      11.02   5.50   3.67   2.75   2.40   2.33
opteron   14.67   6.00   5.33   5.00   5.00   5.00
efficeon   7.06   3.55   2.80   2.97   2.95   3.08
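
one way to read the table, sketched in C: a round trip is two movd instructions, so the parallelism-1 column divided by two approximates the per-movd latency, and two divided by the parallelism-6 column approximates sustained movd throughput. for the p-m that gives 1 cycle per movd and 2 movd per cycle, which is consistent with two separate paths at one per clock each. the struct and function names are made up for illustration, and treating the parallelism-6 column as "saturated" is an assumption:

```c
#include <assert.h>
#include <stdio.h>

/* One round trip is two movd instructions: int->mmx, then mmx->int. */
double movd_per_cycle(double round_trip_cycles)
{
    assert(round_trip_cycles > 0.0);
    return 2.0 / round_trip_cycles;
}

struct row {
    const char *cpu;
    double serial;     /* round-trip cycles at parallelism 1 */
    double saturated;  /* round-trip cycles at parallelism 6 */
};

/* Values copied from the first and last columns of the table above. */
static const struct row rows[] = {
    { "p-m",       2.00, 1.00 },
    { "p4-2",      8.00, 2.33 },
    { "p4-3",     11.02, 2.33 },
    { "opteron",  14.67, 5.00 },
    { "efficeon",  7.06, 3.08 },
};

/* Print approximate per-movd latency and sustained throughput per cpu. */
void print_rates(void)
{
    size_t i;

    for (i = 0; i < sizeof rows / sizeof rows[0]; i++)
        printf("%-9s %5.2f cycles/movd serial, %4.2f movd/cycle saturated\n",
               rows[i].cpu, rows[i].serial / 2.0,
               movd_per_cycle(rows[i].saturated));
}
```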

i'm pretty sure the gladman code was tuned on processors like
p3/p-m... where movd is an improvement over using memory. actually i've
been kind of puzzled why nobody ever used mmx for the XORing on a p3
-- i think the 1 cycle latency mmx plus the 1 cycle latency movd makes
it all pay off fine... but this won't pay off anywhere else because of
2-cycle mmx latency, and long latency between int/mmx register files.

i've seen another totally slimy AES trick for this problem -- store %esp
in a global memory location then use it as a general purpose register.
there's a benchmark which does this, i consider it wholly inappropriate
for general code though :)

-dean