character distribution of the kernel source

Rick Hohensee (humbubba@smarty.smart.net)
Tue, 5 Oct 1999 02:08:33 -0400 (EDT)


There are about 7.5 million space characters in the Linux kernel sources.
There are more underscores than f's. There isn't a single byte with the
high (8th) bit set. The least common printable character is the backtick
(or is that front tick?), of which there are 1523. The occurrence of Arabic
numerals shows the influence of powers of two. 0 is more frequent than
lowercase a, o, u or y.
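
As a quick check of the remark about numerals, the digit counts from the tabulation below can be ranked. This is only a sketch: the numbers are copied from the table in this post, and the name digit_hits is made up for illustration.

```python
# Hit counts for the ten digit characters, copied from the tabulation in this post.
digit_hits = {
    "0": 1710770, "1": 423949, "2": 331720, "3": 208630, "4": 198461,
    "5": 134232, "6": 177881, "7": 117537, "8": 193545, "9": 100201,
}

# Rank the digits from most to least frequent.
ranking = sorted(digit_hits, key=digit_hits.get, reverse=True)
print(" ".join(ranking))  # prints: 0 1 2 3 4 8 6 5 7 9
```

8 lands ahead of 5, 6 and 7, and 4 ahead of 5, which is at least consistent with powers of two turning up more often than their neighbours.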

Byte value occurrence tabulation, Linux 2.2.10 source

hits byte(decimal) ASCII %

7523094 32 (space) 13.68 %
2930679 101 e 5.33 %
2271934 116 t 4.13 %
1971246 105 i 3.59 %
1918466 9 [9] 3.49 %
1909551 10 [10] 3.47 %
1796391 114 r 3.27 %
1777110 115 s 3.23 %
1770556 110 n 3.22 %
1710770 48 0 3.11 %
1580561 97 a 2.87 %
1462395 111 o 2.66 %
1425171 95 _ 2.59 %
1355522 100 d 2.47 %
1189749 99 c 2.16 %
1109031 44 , 2.02 %
990823 102 f 1.80 %
988910 108 l 1.80 %
910563 117 u 1.66 %
849520 42 * 1.54 %
780628 112 p 1.42 %
766729 120 x 1.39 %
644934 41 ) 1.17 %
643712 40 ( 1.17 %
629126 109 m 1.14 %
614705 104 h 1.12 %
590351 45 - 1.07 %
579382 59 ; 1.05 %
486601 103 g 0.88 %
483938 98 b 0.88 %
455832 47 / 0.83 %
423949 49 1 0.77 %
419560 61 = 0.76 %
415109 69 E 0.75 %
403779 83 S 0.73 %
374249 118 v 0.68 %
362766 67 C 0.66 %
348504 73 I 0.63 %
342671 82 R 0.62 %
342527 65 A 0.62 %
340752 84 T 0.62 %
331720 50 2 0.60 %
325338 46 . 0.59 %
301916 62 > 0.55 %
280808 78 N 0.51 %
265274 121 y 0.48 %
263920 107 k 0.48 %
260661 68 D 0.47 %
258035 79 O 0.47 %
232678 76 L 0.42 %
223829 119 w 0.41 %
215502 80 P 0.39 %
208630 51 3 0.38 %
201790 77 M 0.37 %
198461 52 4 0.36 %
193545 56 8 0.35 %
191609 35 # 0.35 %
186512 70 F 0.34 %
177881 54 6 0.32 %
176211 34 " 0.32 %
158613 85 U 0.29 %
146334 66 B 0.27 %
138137 123 { 0.25 %
138103 125 } 0.25 %
134232 53 5 0.24 %
117864 71 G 0.21 %
117537 55 7 0.21 %
116815 58 : 0.21 %
105590 38 & 0.19 %
100201 57 9 0.18 %
97687 43 + 0.18 %
94650 37 % 0.17 %
82792 113 q 0.15 %
80519 72 H 0.15 %
79925 91 [ 0.15 %
79562 93 ] 0.14 %
76004 60 < 0.14 %
71032 92 \ 0.13 %
66224 88 X 0.12 %
66201 124 | 0.12 %
59746 86 V 0.11 %
59076 122 z 0.11 %
57434 75 K 0.10 %
52871 87 W 0.10 %
47396 39 ' 0.09 %
46416 89 Y 0.08 %
45675 33 ! 0.08 %
39007 36 $ 0.07 %
25550 81 Q 0.05 %
23706 106 j 0.04 %
18643 90 Z 0.03 %
12520 63 ? 0.02 %
11541 74 J 0.02 %
10797 126 ~ 0.02 %
9210 64 @ 0.02 %
4283 8 [8] 0.01 %
2838 94 ^ 0.01 %
1523 96 ` 0.00 %
444 12 [12] 0.00 %
223 0 [0] 0.00 %
129 4 [4] 0.00 %
116 16 [16] 0.00 %
110 1 [1] 0.00 %
109 2 [2] 0.00 %
105 14 [14] 0.00 %
104 24 [24] 0.00 %
103 11 [11] 0.00 %
102 7 [7] 0.00 %
100 3 [3] 0.00 %
98 6 [6] 0.00 %
92 20 [20] 0.00 %
90 5 [5] 0.00 %
85 28 [28] 0.00 %
82 17 [17] 0.00 %
81 19 [19] 0.00 %
80 18 [18] 0.00 %
79 15 [15] 0.00 %
79 13 [13] 0.00 %
69 27 [27] 0.00 %
68 30 [30] 0.00 %
66 21 [21] 0.00 %
64 25 [25] 0.00 %
62 23 [23] 0.00 %
60 26 [26] 0.00 %
58 29 [29] 0.00 %
54 31 [31] 0.00 %
54 22 [22] 0.00 %
36 127 [127] 0.00 %
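
A table like the one above could be produced roughly as follows. This is a minimal sketch, not the method actually used for this post: the function names count_bytes and format_table are invented here, and it shows the space character as [32], where the table above originally left that column blank.

```python
import os
from collections import Counter


def count_bytes(root):
    """Return a Counter mapping each byte value to its hits under root."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as f:
                    # Read in 1 MiB chunks; iterating bytes yields ints.
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        counts.update(chunk)
            except OSError:
                continue  # unreadable file, ignore it
    return counts


def format_table(counts):
    """Return lines of 'hits byte ASCII %', most frequent byte first."""
    total = sum(counts.values())
    lines = []
    for byte, hits in counts.most_common():
        # Visible ASCII is shown as itself, everything else as [n].
        label = chr(byte) if 32 < byte < 127 else "[%d]" % byte
        lines.append("%9d %4d %-6s %5.2f %%"
                     % (hits, byte, label, 100.0 * hits / total))
    return lines
```

With an unpacked tree in a directory called, say, linux (a placeholder path), print("\n".join(format_table(count_bytes("linux")))) would print the tabulation.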

Rick Hohensee
