one more small fix for 2.0.31

Harald Koenig (koenig@tat.physik.uni-tuebingen.de)
Wed, 27 Aug 1997 11:19:49 +0200


this patch fixes a real problem with the delay loop for at least
Pentium OverDrive CPUs. it's not intended to get more bogosity
or similar, so please read on below and please test on other CPUs... !!

diff -ur /soft/linux/include/asm-i386/delay.h linux/include/asm-i386/delay.h
--- /soft/linux/include/asm-i386/delay.h Thu Jan 18 23:50:24 1996
+++ include/asm-i386/delay.h Wed Aug 27 10:47:08 1997
@@ -14,7 +14,7 @@
extern __inline__ void __delay(int loops)
{
__asm__ __volatile__(
- ".align 2,0x90\n1:\tdecl %0\n\tjns 1b"
+ ".align 8,0x90\n1:\tdecl %0\nnop\n\tjns 1b"
:/* no outputs */
:"a" (loops)
:"ax");

browsing through my own kernel tree again I found one small patch
which I'd really like to get into 2.0.31.

using my PentiumOverdrive PODP83 CPU, I get very different actual delays
from udelay() depending on the alignment of the code where udelay() or
__delay() gets used. worst case errors are a factor of 2.5 !!

depending on the alignment of the __delay() routine in the BogoMips
measurement (e.g. after adding some NOPs before it) I get either 26, 33,
52.5 or 83 BogoMips (which one depends e.g. on which kernel config options
I use).

now if my actual kernel image determines 83 BogoMips but udelay() is called
at a position with an alignment which would give 33 BogoMips, the delay is
2.5 times too long. the other way (2.5 times too short) might happen too
if the kernel BogoMips calibration gets 33 BogoMips...
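
[A minimal sketch of the arithmetic above, not part of the patch. It assumes
the number of loop iterations is proportional to the calibrated BogoMips value
and the time per iteration is inversely proportional to the rate the loop
really achieves at the call site. The function name and parameters are
illustrative only.]

```c
/*
 * Actual duration (in usec) of a busy-wait delay when the loop count was
 * derived from `calibrated` BogoMips but the loop really runs at
 * `effective` BogoMips at the place where it is called.
 *
 * Assumption: iterations ~ requested_usec * calibrated,
 *             time per iteration ~ 1 / effective.
 */
double actual_delay_usec(double requested_usec,
                         double calibrated, double effective)
{
    return requested_usec * calibrated / effective;
}
```

With these assumptions, actual_delay_usec(100000.0, 83.0, 33.0) comes out at
about 251515 usec, roughly 2.5 times the requested 100000 usec, and swapping
the two rates gives about 39759 usec, roughly 2.5 times too short.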

I've tested all possible shorter alignments and locations for the delay loop;
this one is the shortest possible solution which gives the same delay for
every possible position.

again: this patch changes the delay loop not to get more BogoMips [tm]
but to get consistent and reproducible delays!

below is my small test program which I've used to try this.
using the Pentium cycle counter it measures the actual delay
of udelay(100000) which should be 0.1s obviously.

but when using the original __delay() (compiled with -DORIG)
I get e.g. the following output (all delays in usec, they should
all be ~100000 !!) while the new delay loop (without -DORIG)
gives correct 100ms delays for all calls...

0 Calibrating delay loop.. ok - 26.16 BogoMIPS
32889.300
31459.458
31478.562
95411.986
79793.863
32595.742
79786.351
101308.965
31445.142
31468.614
32585.914
116704.523
80894.700
32570.866
79806.211
95420.866
32611.246
32679.191
32569.882
95531.578
79664.095
31445.562
79811.131
113522.674
31455.114
31469.442
32581.630
97884.836
78611.762
31462.758
79758.247
96601.862
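
[A rough sketch of the kind of measurement described above, not the attached
udelay-test.c itself. It reads the x86 time-stamp counter around a call and
converts the cycle difference to usec. ASSUMED_CPU_HZ, read_cycle_counter(),
cycles_to_usec() and time_call_usec() are hypothetical names; the 83 MHz clock
is an assumption for a PODP83 and has to be adjusted for other machines.]

```c
#include <stdint.h>

/* Assumed core clock of the CPU under test, in Hz (assumption, adjust). */
#define ASSUMED_CPU_HZ 83000000ULL

/* Read the time-stamp counter on x86; returns 0 on other architectures
 * so that the file still compiles there. */
uint64_t read_cycle_counter(void)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return 0;
#endif
}

/* Convert a cycle count to microseconds for a given clock rate. */
double cycles_to_usec(uint64_t cycles, uint64_t cpu_hz)
{
    return (double)cycles * 1e6 / (double)cpu_hz;
}

/* Time one call of fn(arg) and return its duration in usec. */
double time_call_usec(void (*fn)(unsigned long), unsigned long arg,
                      uint64_t cpu_hz)
{
    uint64_t start = read_cycle_counter();
    fn(arg);
    uint64_t end = read_cycle_counter();
    return cycles_to_usec(end - start, cpu_hz);
}
```

Calling time_call_usec() repeatedly on a delay routine placed at different
code alignments would give one reading per call site, comparable to the
listing above. At the assumed 83 MHz, 8300000 cycles correspond to 100000 usec.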

Harald

begin 644 udelay-test.c.gz
M'XL(`._N`S0"`^U6;6_;-A#^;/^*FPH7DN<WR>B6VG$Q-,V:;LT2I!T&%`$$
M6J)LUK3H253B;.M^^^Y(V8W29F@_;6AI!+%YXAWOGN>YHX;=-G0!JI1+=N,'
MH'FI85.H1<'6D*D"SL^>G0/,6<E34/D$=QN'YN>59GG*I,HY/%4+=2HVY2Y(
M[?"T#@")2CF\%'FUA14O<BZM1>3X)_1PS40^2`##D9?($XFI#27M'YH<!\LZ
MXH^8W%H5Y(IYKID6&%U1&,V+3<&UR!>@EQP*7E92ESTH.3<&D^*+\U<4Y11/
M[9^<_?;Z#%*55&N>ZT%]P!4O2HPY@7`P!E,TJ_12%1/XB6<9O"X8G00^K>)Z
M]<.IT%P.CLY.`_08MMM#$RI1ZXV0'*Z%7KZ'+TF@?Q9!7X%.HG[.K_=XXGJ0
MW+=1%6(!_6=G%R^>[S?B20]2G@DDX.C\U_CTY`^`@_%@/![#L+OGL7\PMEMK
M7.&PU*E0@^636Z8J%VAMVK18\Z:EO"F'&X;\DKG-MUA[#E5>BD6./*,2%OA/
M;<IXPXNXY,ETGU_I;P-XL-T[Q;'(D5\>QW"E1(IKJT7DT88(VG_BP1FZ`]7<
M1D#BF)7K./:]`9-X(D2]T?;QZ#(/)Y<ZY8F$#BXN]=N\A'#N36#B,0]\&PU_
M;[T`\^&RY!\+=O"Q8+G:?$+$/!59^]V]I=5=UH2I0G1,C9B*^0W=&8RV(_R$
MH^2[::N%#$;=[CB"(80C\R$6;V>^KB2EZ;5;:)UXLQ1SJ^-:"]L;>MYHE_B.
MF_TF4T6K94/7N1JG*16%'.1$0AS_?'SQR_'+.-YSNBF0K97]RO;6MR++!"\A
MD2I9^<$.'^P);$-10KE4E4QASH%M<%IL(<+6[-KIH164FA4:_%QI;F:#8!)=
M1*:#'LT';"8I*=52XP^X5L4*^!7'`9"9[4S*&PRC0+)BP7L@M/&`MQ5..,U6
MJ$2D>ZEQ%_&`G8R@_HN"809^>'@81H@&E?":2L`_FBAYM9YC`)7!7.B2OG$`
M)8+&AVD^VM,(IO)T`'#,DB55@$XFHY+F%\.YPQ8<I\ZCX<D;L)M+W&T.]*58
MV2E&8T#D3`:4!`,IM)9&SW.N:2[I)<LA[%!5.T)>GK^*SR^.C^"@W3:"3%#Q
M\X)I7K--1I)BBYI/BV153NUO2AZSW*VHD'V!LWU<!*9U/V8MJQ+?.ZI/I0%M
MSC7A!P,P\KM>TJ1L*A0.#V<0!H"943=<,P2,</6L1E3F69&9G*GB5LMDC^?7
M(B1=UY'K)_M'*'\3%86VI)2ZPZD]YKD"3.J^:+L.:;;2],/-T-]#V4)MVO/-
MH?."LQ69WQE%/5/(XQPY+6[JCA#UM8;=L.#Z`PFA.#0]X[]7V!MT^QH42`1^
MM:$G3:9(G($IJ(GNDR>([M1:28NSN[-[!UTS7K\/#Q]:ILC+1`G`LM0\X*_9
M+05]#C73S\6>`-[A_LW,`A\0F76/VZZXI9,[F3Z<P=_O4[6\9-1F\@:G3FJ&
MA_$`]*B;DUY>L./6IO'K?D5;J3E+R61(P;$^;-T%'B?]R9LIG7&A*@Q$?7W%
M9,7-@#/]0G/+>.ZZ1ZU04!U9#3JC2%;[EYG+W.MA.4U(OHT>C4;!\)&Y-,SC
M>Y\'T*';Q8YZ.QWH7<RO;R;*A(4]%O72L)=&V,YHS$6"I/7#Q^B%2SO^?>\R
MAY'I97IUN3-AFCM'N+&#5T!G2^DW4NO=Y1;=#(W-\3+:1M\?I)C_U+S9X"86
MSE@T2\-9&LU"FZ=0&^F/`[O`*]/W\#JO$ZQ*R?G&IXO5&LSSP?P&;QV\A3-\
M&1B'>.%[,[I#68B7C[U?TS"8U"%L:?9N_I0@T?L@D0ER&Y-..!J,,\(#-_;Q
MP&']1O?Q_+_L<KZJZK[F8EWM#@J'C`/JOP7*X>9@_-_"Z%!U('^1(#O,'06.
M`L>((\@1Y/AR?#GZ')N.34>N(]=Q[;AVU#OJG1*<$IPPG#"<3IQ.W-+)QBV=
FBIR*G*B<J-S2:<PMG>2<Y)P"G0+=T@GRLP3YKOT/I<PF=@%```!.
`
end

--
All SCSI disks will from now on                     ___       _____
be required to send an email notice                0--,|    /OOOOOOO\
24 hours prior to complete hardware failure!      <_/  /  /OOOOOOOOOOO\
                                                    \  \/OOOOOOOOOOOOOOO\
                                                      \ OOOOOOOOOOOOOOOOO|//
Harald Koenig,                                         \/\/\/\/\/\/\/\/\/
Inst.f.Theoret.Astrophysik                              //  /     \\  \
koenig@tat.physik.uni-tuebingen.de                     ^^^^^       ^^^^^