Re: Speed of memcpy, csum_partial and csum_partial_copy

Linus Torvalds (torvalds@cs.helsinki.fi)
Sun, 9 Jun 1996 09:03:22 +0300 (EET DST)


On Sat, 8 Jun 1996, Robert L Krawitz wrote:
>
> 643 23.78% 00191324 csum_partial_copy_fromuser
> 997 36.88% 001369c8 memcpy_toiovec
> 2703 100.00% 00000000 total
>
> That's very, very interesting. Somehow the checksum routine was
> faster than the raw memcpy routine.

Yes. But there could be secondary stuff here that doesn't show up, like
the block being in the cache before entering the system call, but the TCP
processing flushing the cache enough that by the time we copy it out to
user space we have nothing cached any more..

> (This is on a P166 with a reasonably good memory subsystem, and the
> machine was sending 500MB of data over TCP loopback).
>
> I presume that means EDO RAM. If so, I'm guessing that
> csum_partial_copy_fromuser was running at about 45 MB/sec, and hence
> your overall throughput was something like 11 MB/sec?

I actually get closer to 20MB/s over TCP loopback. It's EDO RAM and sync
burst cache etc. 20MB/s over TCP isn't bad on a machine that does memcpy
at 43MB/s..

Linus
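
The back-of-the-envelope arithmetic in this exchange can be sketched as
follows. This is only a rough model, not anything from the kernel: it
assumes profile ticks are proportional to time spent, and that each
routine touches every transferred byte exactly once. The function names
implied_overall and implied_rate are made up for illustration; the tick
counts are the ones quoted above.

```python
# Profile ticks quoted in the message above.
TOTAL_TICKS = 2703
TICKS = {
    "csum_partial_copy_fromuser": 643,
    "memcpy_toiovec": 997,
}


def implied_overall(routine_mb_s, ticks, total_ticks=TOTAL_TICKS):
    """Overall throughput implied by a routine's own copy rate.

    Assumption: the routine handles every byte once, and its share of
    the profile ticks equals its share of wall-clock time.
    """
    return routine_mb_s * ticks / total_ticks


def implied_rate(overall_mb_s, ticks, total_ticks=TOTAL_TICKS):
    """Copy rate a routine must sustain, given the overall throughput.

    This is the inverse of implied_overall, under the same assumptions.
    """
    return overall_mb_s * total_ticks / ticks


if __name__ == "__main__":
    # Robert's guess: 45 MB/s in csum_partial_copy_fromuser.
    guess = implied_overall(45.0, TICKS["csum_partial_copy_fromuser"])
    print(f"overall implied by 45 MB/s checksum-copy: {guess:.1f} MB/s")

    # Working backwards from the 20 MB/s measured over loopback.
    for name, ticks in TICKS.items():
        rate = implied_rate(20.0, ticks)
        print(f"{name}: {rate:.1f} MB/s implied")
```

Under these assumptions, a 45 MB/s checksum-copy gives roughly 10.7 MB/s
overall, which matches the "something like 11 MB/sec" guess. Working
backwards from 20 MB/s instead gives about 84 MB/s for
csum_partial_copy_fromuser and about 54 MB/s for memcpy_toiovec. These
are only what the simple model implies; cache effects like the one
described above are exactly the kind of thing it ignores.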