[2.1.82] SERIOUS FLAWS in IEEE math handling on ALPHA.

Dominik Kubla (kubla@sundiver.zdv.uni-mainz.de)
Tue, 10 Feb 1998 23:45:38 +0100


[ NOTE: This is a cross-posting to several mailing lists, so please
respect the reply-to. Thanks! -dbk ]

Hi folks,

it appears that there is a serious flaw in the IEEE math handling on
the Alpha. A colleague of mine complained to me about having serious
problems with his numerical programs getting SIGFPE at times, even though
he has compiled them with -mieee. That prompted me to investigate the
problem, so I fetched the paranoia IEEE conformance test and ran it on
my UDB (using Linux 2.1.82, egcs 1.0 and glibc 2.0.5c). As a control we
ran the same test on a UDB running NetBSD 1.3 (using gcc 2.7.2.1) and
on an EB64-Clone running Digital Unix 4.0 (using Digital's c89).
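
For anyone who wants a quick check before running all of paranoia, the
two operations that IEEE 754 defines to return Inf and NaN by default,
rather than trap, can be tried in isolation. What follows is only a
minimal sketch and is not taken from paranoia: the names quotient() and
report_exceptional_quotients() are made up for illustration, and it
assumes the file is compiled with -mieee on the Alpha as described above.

```c
#include <math.h>    /* isinf(), isnan() classification macros */
#include <stdio.h>   /* printf() */

/* Divide at run time. The volatile copies keep the compiler from
   folding the division into a constant at compile time. */
double quotient(double num, double den)
{
    volatile double n = num, d = den;
    return n / d;
}

/* Print one line per exceptional division. With IEEE default
   (non-trapping) handling both lines are expected to appear. */
void report_exceptional_quotients(void)
{
    double q_inf = quotient(1.0, 0.0);
    double q_nan = quotient(0.0, 0.0);

    printf("1 / 0 -> %s\n", isinf(q_inf) ? "Inf" : "not Inf");
    printf("0 / 0 -> %s\n", isnan(q_nan) ? "NaN" : "not NaN");
}
```

Calling report_exceptional_quotients() from main() should print both
lines if the defaults are honoured; a SIGFPE before either line would
match the symptom reported here.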

The results were somewhat unexpected:

* Digital Unix was flawless. (As it should be, considering the price
tag.)

* NetBSD failed the Inf and NaN tests with SIGFPE, but was flawless
otherwise.

* Linux had 1 serious defect, 2 defects and 1 flaw (see the attached
output of paranoia).
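
Two of the Linux findings are easy to probe in isolation, going by the
attached output: overflow past -8.98846567431157954e+307 came back as
-0 instead of -Inf, and the smallest positive number found was
2.22507e-308, which suggests that results below that threshold are not
kept as denormals. The sketch below is an assumption about how one might
check both by hand, not code from paranoia; the function names are
hypothetical.

```c
#include <float.h>   /* DBL_MIN */
#include <math.h>    /* isinf() classification macro */

/* Returns 1 if doubling -2^1023 overflows to -Inf, as IEEE 754
   requires by default, and 0 otherwise (for example if it
   collapses to -0 as in the attached output). */
int overflow_goes_to_minus_inf(void)
{
    volatile double y = -8.98846567431157954e+307;  /* -2^1023, from the log */
    volatile double z = y * 2.0;                     /* computed at run time */
    double result = z;

    return isinf(result) && result < 0.0;
}

/* Returns 1 if halving the smallest normalized double gives a
   nonzero denormal (gradual underflow), and 0 if the result is
   flushed to zero or otherwise out of range. */
int underflow_is_gradual(void)
{
    volatile double tiny = DBL_MIN;      /* 2^-1022, the underflow threshold */
    volatile double half = tiny / 2.0;   /* computed at run time */
    double result = half;

    return result > 0.0 && result < tiny;
}
```

On a conforming system both functions should return 1. A return of 0
from either one only says that something differs from the IEEE defaults;
it does not say whether the hardware, the compiler or the kernel trap
handler is at fault.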

I am now looking at the IEEE kernel code, hoping to find the bugs.
But somebody with more than an idea of what he is doing might actually
find them faster.

Until these bugs are fixed, I would advise people who need to rely on
the results of their numerical calculations to be very cautious when
using Linux/Alpha.

The paranoia outputs from OSF and NetBSD are available on request, should
the need arise.

Yours,
Dominik Kubla
Unix User Group ("UFO")
Johannes Gutenberg-University, Mainz

=============================== CUT HERE ================================

Lest this program stop prematurely, i.e. before displaying

`END OF TEST',

try to persuade the computer NOT to terminate execution when an
error like Over/Underflow or Division by Zero occurs, but rather
to persevere with a surrogate value after, perhaps, displaying some
warning. If persuasion avails naught, don't despair but run this
program anyway to see how many milestones it passes, and then
amend it to make further progress.

Answer questions with Y, y, N or n (unless otherwise indicated).

To continue, press RETURN
Diagnosis resumes after milestone Number 0 Page: 1

Users are invited to help debug and augment this program so it will
cope with unanticipated and newly uncovered arithmetic pathologies.

Please send suggestions and interesting results to
Richard Karpinski
Computer Center U-76
University of California
San Francisco, CA 94143-0704, USA

In doing so, please include the following information:
Precision: double;
Version: 10 February 1989;
Computer: DEC Multia (VX-42), Linux 2.1.82, glibc 2.0.5c

Compiler: egcs-1.0

Optimization level: -O3

Other relevant compiler options: -mieee -lm

To continue, press RETURN
Diagnosis resumes after milestone Number 1 Page: 2

Running this program should reveal these characteristics:
Radix = 1, 2, 4, 8, 10, 16, 100, 256 ...
Precision = number of significant digits carried.
U2 = Radix/Radix^Precision = One Ulp
(OneUlpnit in the Last Place) of 1.000xxx .
U1 = 1/Radix^Precision = One Ulp of numbers a little less than 1.0 .
Adequacy of guard digits for Mult., Div. and Subt.
Whether arithmetic is chopped, correctly rounded, or something else
for Mult., Div., Add/Subt. and Sqrt.
Whether a Sticky Bit used correctly for rounding.
UnderflowThreshold = an underflow threshold.
E0 and PseudoZero tell whether underflow is abrupt, gradual, or fuzzy.
V = an overflow threshold, roughly.
V0 tells, roughly, whether Infinity is represented.
Comparisions are checked for consistency with subtraction
and for contamination with pseudo-zeros.
Sqrt is tested. Y^X is not tested.
Extra-precise subexpressions are revealed but NOT YET tested.
Decimal-Binary conversion is NOT YET tested for accuracy.

To continue, press RETURN
Diagnosis resumes after milestone Number 2 Page: 3

The program attempts to discriminate among
FLAWs, like lack of a sticky bit,
Serious DEFECTs, like lack of a guard digit, and
FAILUREs, like 2+2 == 5 .
Failures may confound subsequent diagnoses.

The diagnostic capabilities of this program go beyond an earlier
program called `MACHAR', which can be found at the end of the
book `Software Manual for the Elementary Functions' (1980) by
W. J. Cody and W. Waite. Although both programs try to discover
the Radix, Precision and range (over/underflow thresholds)
of the arithmetic, this program tries to cope with a wider variety
of pathologies, and to say how well the arithmetic is implemented.

The program is based upon a conventional radix representation for
floating-point numbers, but also allows logarithmic encoding
as used by certain early WANG machines.

BASIC version of this program (C) 1983 by Prof. W. M. Kahan;
see source comments for more history.

To continue, press RETURN
Diagnosis resumes after milestone Number 3 Page: 4

Program is now RUNNING tests on small integers:
-1, 0, 1/2, 1, 2, 3, 4, 5, 9, 27, 32 & 240 are O.K.

Searching for Radix and Precision.
Radix = 2.000000 .
Closest relative separation found is U1 = 1.1102230e-16 .

Recalculating radix and precision
confirms closest relative separation U1 .
Radix confirmed.
The number of significant digits of the Radix is 53.000000 .

To continue, press RETURN
Diagnosis resumes after milestone Number 30 Page: 5

Subtraction appears to be normalized, as it should be.
Checking for guard digit in *, /, and -.
*, /, and - appear to have guard digits, as they should.

To continue, press RETURN
Diagnosis resumes after milestone Number 40 Page: 6

Checking rounding on multiply, divide and add/subtract.
Multiplication appears to round correctly.
Division appears to round correctly.
Addition/Subtraction appears to round correctly.
Checking for sticky bit.
Sticky bit apparently used correctly.

Does Multiplication commute? Testing on 20 random pairs.
No failures found in 20 integer pairs.

Running test of square root(x).
Testing if sqrt(X * X) == X for 20 Integers X.
Test for sqrt monotonicity.
sqrt has passed a test for Monotonicity.
Testing whether sqrt is rounded or chopped.
Square root appears to be correctly rounded.

To continue, press RETURN
Diagnosis resumes after milestone Number 90 Page: 7

Testing powers Z^i for small Integers Z and i.
... no discrepancis found.

Seeking Underflow thresholds UfThold and E0.

FLAW: Underflow can stick at an allegedly positive
value PseudoZero that prints out as 8.9003e-308 .
Since comparison denies Z = 0, evaluating (Z + Z) / Z should be safe.
What the machine gets for (Z + Z) / Z is 2.00000000000000000e+00 .
This is O.K., provided Over/Underflow has NOT just been signaled.
Smallest strictly positive number found is E0 = 2.22507e-308 .
Since comparison denies Z = 0, evaluating (Z + Z) / Z should be safe.
What the machine gets for (Z + Z) / Z is 2.00000000000000000e+00 .
This is O.K., provided Over/Underflow has NOT just been signaled.

To continue, press RETURN
Diagnosis resumes after milestone Number 120 Page: 8

The Underflow threshold is 2.22507385850720138e-308, below which
calculation may suffer larger Relative error than merely roundoff.
Since underflow occurs below the threshold
UfThold = (2.00000000000000000e+00) ^ (-1.02200000000000000e+03)
only underflow should afflict the expression
(2.00000000000000000e+00) ^ (-1.02200000000000000e+03);
actually calculating yields: 0.00000000000000000e+00 .
This computed value is O.K.

Testing X^((X + 1) / (X - 1)) vs. exp(2) = 7.38905609893065218e+00 as X -> 1.
Accuracy seems adequate.
Testing powers Z^Q at four nearly extreme values.
... no discrepancies found.

To continue, press RETURN
Diagnosis resumes after milestone Number 160 Page: 9

Searching for Overflow threshold:
This may generate an error.
Can `Z = -Y' overflow?
Trying it on Y = -8.98846567431157954e+307 .
Seems O.K.
SERIOUS DEFECT: overflow past -8.98846567431157954e+307
shrinks to -0.00000000000000000e+00 .
Overflow threshold is V = 8.98846567431157954e+307 .
Overflow saturates at V0 = 8.98846567431157954e+307 .
No Overflow should be signaled for V * 1 = 8.98846567431157954e+307
nor for V / 1 = 8.98846567431157954e+307 .
Any overflow signal separating this * from the one
above is a DEFECT.

DEFECT: Comparison alleges that what prints as Z = 2.22507385850720138e-308
is too far from sqrt(Z) ^ 2 = 1.16204282018886510e-307 .
DEFECT: Comparison alleges that what prints as Z = 2.22507385850720138e-308
is too far from sqrt(Z) ^ 2 = 1.16204282018886510e-307 .

To continue, press RETURN
Diagnosis resumes after milestone Number 190 Page: 10

What message and/or values does Division by Zero produce?
This can interupt your program. You can skip this part if you wish.
Do you wish to compute 1 / 0?
Trying to compute 1 / 0 produces ... Inf .

Do you wish to compute 0 / 0?
Trying to compute 0 / 0 produces ... NaN .

To continue, press RETURN
Diagnosis resumes after milestone Number 220 Page: 11

The number of SERIOUS DEFECTs discovered = 1.
The number of DEFECTs discovered = 2.
The number of FLAWs discovered = 1.

The arithmetic diagnosed has unacceptable Serious Defects.
END OF TEST.
