Because that's how the TCP checksum works. It's very clever because it
can be written in the same fast way on little-endian and big-endian
machines, and adding 16 bits, 32 bits, 64 bits or even 128 bits at a
time.
May I direct you to the rfc1071 which is all about these things. I
quote:
"This simple checksum has a number of wonderful mathematical
properties that may be exploited to speed its calculation, as we
will now discuss."
> That might be the reason, yes.
> it might also be that only one load can be executed per cycle.
That's not true on a Pentium if you pick non-conflicting addresses.
Both instructions in a pair can read from cache.
(I forget the precise rule for conflicting addresses).
-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/