Re: 'Mega patch 2.1.14#5'

Samuel S Chessman (chessman@wauug.erols.com)
Mon, 9 Dec 1996 19:17:45 -0500 (EST)


I found 2.1.14-megapatch6 at cabi.net and applied it.
kernel/sys.c was missing an include of linux/kgdb.h, which null-defines
the breakpoint function when kgdb is disabled; otherwise the patch was
clean for my config.

SMP works well on the P6DOF, giving better performance than 2.0.26 SMP,
except for process creation and concurrent shell scripts.

In both the 2.0 and 2.1 series, SMP reaches about 70% of single-CPU
performance overall. The major exception is the 8 concurrent shell scripts
benchmark, where SMP truly shines at roughly 140% of single-CPU performance.
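
Those percentages can be checked against the FINAL SCORE lines and the
"Shell Scripts (8 concurrent)" RESULT values in the tables below. A minimal
sketch in Python; the language and the function name percent_of are my own
choices for illustration, not part of Unixbench. Note that the single-CPU
runs are 2.1.10 and 2.0.24, so the pairing with 2.1.14 and 2.0.26 is only
approximate.

```python
def percent_of(smp, up):
    """Return the SMP figure as a percentage of the single-CPU figure."""
    return round(100.0 * smp / up, 1)

# FINAL SCORE values copied from the tables in this message.
overall_21 = percent_of(50.9, 70.5)   # 2.1.14 SMP vs 2.1.10 single CPU
overall_20 = percent_of(46.9, 62.0)   # 2.0.26 SMP vs 2.0.24 single CPU

# "Shell Scripts (8 concurrent)" RESULT values from the same tables.
shell_21 = percent_of(83.0, 59.0)
shell_20 = percent_of(86.3, 58.7)

print("overall, 2.1 series:", overall_21)  # 72.2
print("overall, 2.0 series:", overall_20)  # 75.6
print("shell x8, 2.1 series:", shell_21)   # 140.7
print("shell x8, 2.0 series:", shell_20)   # 147.0
```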

Summary of Unixbench 4.0 on 2.0.26 and 2.1.14 Megapatch#6
Notes:
baseline index is 10.0 on SS20-60.
Higher numbers are faster.
Unixbench 4.0 is available at
ftp://linux.wauug.org/ftp/pub/bench/unixbench-4.0-DELTA.tgz
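
To make the INDEX column concrete: since the baseline machine scores 10.0,
each index works out to 10.0 * RESULT / BASELINE. The FINAL SCORE lines are
also consistent with a geometric mean of the eleven indices, but that is my
inference from the numbers, not something stated by the benchmark output
here. A sketch in Python using the 2.1.14 SMP run below; the names index and
geometric_mean are hypothetical helpers, not Unixbench code.

```python
import math

def index(result, baseline):
    """Index relative to the baseline machine, which scores 10.0."""
    return 10.0 * result / baseline

def geometric_mean(values):
    """Geometric mean via logs (assumed scoring rule, see note above)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# (BASELINE, RESULT) pairs from the rh-128M-2CPU-LX2.1.14.961209 table.
run = [
    (29820.0, 52816.5), (116700.0, 320194.7), (43.0, 228.9),
    (3960.0, 26404.0), (1655.0, 12077.0), (5800.0, 30985.0),
    (12440.0, 78503.4), (4000.0, 16497.7), (126.0, 906.2),
    (6.0, 83.0), (15000.0, 51661.8),
]

indices = [index(result, baseline) for baseline, result in run]
final_score = geometric_mean(indices)

print(round(indices[9], 1))   # shell scripts index, 138.3 in the table
print(round(final_score, 1))  # compare with FINAL SCORE in the table
```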

rh-128M-2CPU-LX2.1.14.961209

TEST                                    BASELINE    RESULT   INDEX

Arithmetic Test (type = double)          29820.0   52816.5    17.7
Dhrystone 2 using register variables    116700.0  320194.7    27.4
Execl Throughput                            43.0     228.9    53.2
File Copy 1024 bufsize 2000 maxblocks     3960.0   26404.0    66.7
File Copy 256 bufsize 500 maxblocks       1655.0   12077.0    73.0
File Copy 4096 bufsize 8000 maxblocks     5800.0   30985.0    53.4
Pipe Throughput                          12440.0   78503.4    63.1
Pipe-based Context Switching              4000.0   16497.7    41.2
Process Creation                           126.0     906.2    71.9
Shell Scripts (8 concurrent)                 6.0      83.0   138.3
System Call Overhead                     15000.0   51661.8    34.4
                                                         =========
FINAL SCORE                                                   50.9

rh-128M-2CPU-LX2.0.26.961126

TEST                                    BASELINE    RESULT   INDEX

Arithmetic Test (type = double)          29820.0   52828.0    17.7
Dhrystone 2 using register variables    116700.0  321748.4    27.6
Execl Throughput                            43.0     236.8    55.1
File Copy 1024 bufsize 2000 maxblocks     3960.0   23807.0    60.1
File Copy 256 bufsize 500 maxblocks       1655.0   10030.0    60.6
File Copy 4096 bufsize 8000 maxblocks     5800.0   30463.0    52.5
Pipe Throughput                          12440.0   57406.8    46.1
Pipe-based Context Switching              4000.0   13050.2    32.6
Process Creation                           126.0    1107.8    87.9
Shell Scripts (8 concurrent)                 6.0      86.3   143.8
System Call Overhead                     15000.0   36945.6    24.6
                                                         =========
FINAL SCORE                                                   46.9

For comparison, single-processor configurations on the same hardware:

rh3-128M-1CPU-LX2.1.10.961115

TEST                                    BASELINE    RESULT   INDEX

Arithmetic Test (type = double)          29820.0   52429.5    17.6
Dhrystone 2 using register variables    116700.0  319332.9    27.4
Execl Throughput                            43.0     260.1    60.5
File Copy 1024 bufsize 2000 maxblocks     3960.0   31529.0    79.6
File Copy 256 bufsize 500 maxblocks       1655.0   17348.0   104.8
File Copy 4096 bufsize 8000 maxblocks     5800.0   34234.0    59.0
Pipe Throughput                          12440.0  114065.9    91.7
Pipe-based Context Switching              4000.0   50020.3   125.1
Process Creation                           126.0    2579.7   204.7
Shell Scripts (8 concurrent)                 6.0      59.0    98.3
System Call Overhead                     15000.0   96996.5    64.7
                                                         =========
FINAL SCORE                                                   70.5

rh-128M-1CPU-LX2.0.24.961101

TEST                                    BASELINE    RESULT   INDEX

Arithmetic Test (type = double)          29820.0   52757.8    17.7
Dhrystone 2 using register variables    116700.0  321394.1    27.5
Execl Throughput                            43.0     259.0    60.2
File Copy 1024 bufsize 2000 maxblocks     3960.0   28903.0    73.0
File Copy 256 bufsize 500 maxblocks       1655.0   14033.0    84.8
File Copy 4096 bufsize 8000 maxblocks     5800.0   33552.0    57.8
Pipe Throughput                          12440.0   81685.9    65.7
Pipe-based Context Switching              4000.0   37619.2    94.0
Process Creation                           126.0    2500.8   198.5
Shell Scripts (8 concurrent)                 6.0      58.7    97.8
System Call Overhead                     15000.0   62173.5    41.4
                                                         =========
FINAL SCORE                                                   62.0

While there is clearly room in the SMP architecture to improve
performance, as the single-CPU benchmarks leave lots of headroom, I am
still genuinely awestruck at the increases in 2.1. Where does Linus find
the cycles?

As another point of reference, the Ultra on my desk, gcc 2.7.2,
SunOS rosie 5.5.1 Generic sun4u sparc SUNW,Ultra-1 clock 143 MHz

INDEX VALUES
TEST                                    BASELINE    RESULT   INDEX

Arithmetic Test (type = double)          29820.0   34514.4    11.6
Dhrystone 2 using register variables    116700.0  258164.4    22.1
Execl Throughput                            43.0      75.3    17.5
File Copy 1024 bufsize 2000 maxblocks     3960.0    1317.0     3.3
File Copy 256 bufsize 500 maxblocks       1655.0    2143.0    12.9
File Copy 4096 bufsize 8000 maxblocks     5800.0    1260.0     2.2
Pipe Throughput                          12440.0   37144.8    29.9
Pipe-based Context Switching              4000.0   11952.6    29.9
Process Creation                           126.0     331.2    26.3
Shell Scripts (8 concurrent)                 6.0      16.0    26.7
System Call Overhead                     15000.0   41970.2    28.0
                                                         =========
FINAL SCORE                                                   14.8

Sam Chessman (SSC3) chessman@wauug.erols.com