Re: 2.1 kernel bloat revisited

David S. Miller (davem@jenolan.rutgers.edu)
Mon, 31 Mar 1997 20:24:26 -0500


Date: Mon, 31 Mar 1997 09:43:28 -0500
From: "Theodore Y. Ts'o" <tytso@MIT.EDU>

What percentage of the time spent in a TCP open is due to
SHATransform? And are there boxes which are doing enough TCP opens
that this is causing a measurable difference?

When a machine gets hit with 800 or so web operations per second, 6%
of the total CPU time of the entire system is in SHATransform. This
drives me nuts, because outside of the tcp queueing decision bugs we
have in 2.1.x, it is one of the major things preventing us from
hitting "big league" web performance numbers (i.e. SGI can get 1200
conns/second on similar hardware).

I think it is ridiculous to have such a number cruncher in a
critical code path. This is the primary motivation behind Eric and
myself searching for some way to make this sequence number creation
several orders of magnitude faster yet still retain the secure
properties of the current code as best as possible.

There are some hand-coded SHA implementations in assembly that I
haven't really bothered with because SHATransform isn't part of the
critical path for /dev/random. But if this is really a concern,
there are SHA implementations which are smaller and faster than the
current "big and fast" version which is written in C.

It is a bottleneck in the TCP code.

---------------------------------------------////
Yow! 11.26 MB/s remote host TCP bandwidth & ////
199 usec remote TCP latency over 100Mb/s ////
ethernet. Beat that! ////
-----------------------------------------////__________ o
David S. Miller, davem@caip.rutgers.edu /_____________/ / // /_/ ><