Re: checksumming with mmx, comment in arch/i386/lib/mmx.c

From: Valdis.Kletnieks@vt.edu
Date: Tue Feb 11 2003 - 14:06:42 EST


On Tue, 11 Feb 2003 13:37:07 EST, dank@suburbanjihad.net (nick black) said:

> firstly, to what domain of checksums does this comment apply? secondly,
> why is it true? it seems the PADDW family of instructions could work
> well here; is the slowdown a result of the kernel's need to muck with
> fpu state (from what i can tell, mmx uses the fp registers)?

(Note - second-hand info from somebody else who looked at MMX/SSE to optimize
an inner loop. Double-check with CPU documentation).

There's a big "urp" sound as the processor switches from FP to MMX mode and
back, which apparently takes a large number of cycles. You can to some extent
amortize this if you're switching once for a LONG loop (the analysis I saw was
with a million or so pixels on a screen image) - if you're switching in and out
for a 1500 byte packet (or even worse, a 100-byte packet) the impact may be
more noticable. You may wish to examine the SSE/SSE2 opcodes, which apparently
don't take this performance hit.

-- 
				Valdis Kletnieks
				Computer Systems Senior Engineer
				Virginia Tech


- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Feb 15 2003 - 22:00:35 EST