improved num_digits() function for x86/lib/misc.c

From: Jason Quinn
Date: Thu Jan 26 2023 - 08:40:31 EST

Next message: Rob Herring: "Re: [PATCH] dt-bindings: usb: samsung,exynos-dwc3: allow unit address in DTS"
Previous message: Linus Walleij: "Re: [PATCH 0/2] MediaTek pinctrl cleanups and improvements"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Dear X86 Architecture (32-bit and 64-bit) Maintainers,

While diving into smpboot.c source to investigate what I believe to be a bug
that causes listings of CPU numbers to show up as kernel warnings during boot,
I also checked out the source for lib/misc.c that houses the
"int num_digits(int val)" function. It seems kernel/smpboot.c is the only file
in the linux kernel that directly uses this function. The implementation used
for num_digits() has multiple issues. First there are some bugs:

BUG A: For large inputs, the function returns incorrect results, if it returns
at all. For example, on my amd64 machine, num_digits(1500000000) incorrectly
returns 16 instead of 10 and the function may simply get stuck in an infinite
loop as it does with num_digits(2000000000). Both inputs are valid 4-byte
integers so this should not occur. This problem is triggered whenever the input
argument has the maximum number of digits of an int variable ( abs(val) >= 1e9
for 4-byte ints). The faulty behavior occurs because the local variable "m"
overflows in the while loop for these large (but valid) inputs.

It seems this function is only used to printk() some core and cpu numbers.
Admittedly, a billion is still far less than the number of core in the largest
supercomputer today (The Andromeda's 15.5 million) but this function should
be functionally correct for any input for its domain, especially since
somebody someday might trustingly use it for other purposes than info messages.
In other words, this is a bug that should be fixed despite it not currently
causing a practical problem in the SMP code. The good news is that the function
itself can be made faster while also extending correctness.

BUG B: For the specific case of num_digits(INT_MIN), the function may enter an
infinite loop. This occurs because an -INT_MIN calculation, which the
implementation performs, may be undefined behavior depending on architecture.
In particular, -INT_MIN is undefined for two's complement signed integers,
which includes almost all computers nowadays. [NOTE: the upcoming C2x standard
will REQUIRE the two's complement representation of signed ints.] Compilers
will often treat -INT_MIN as equal to INT_MIN (see gcc's -ftrapv and -fwrapv
options) in which case the line "val=-val" doesn't actually change the sign and
subsequently the function gets trapped in a loop. We must, therefore, handle
an input of INT_MIN as a special case.

IMPROVEMENT: The current implementation of num_digits() (reproduced now):

int num_digits(int val)
{
int m = 10;
int d = 1;

if (val < 0) {
d++;
val = -val;
}

while (val >= m) {
m *= 10;
d++;
}

return d;
}

is not only buggy but slow, relying on multiplication operations. It is possible
to have code of no less clarity (I say even clearer) and without the
multiplication. Here's my sketch of an improved version that assumes
(INT_MIN=2147483648 or INT_MIN=-2147483647) and INT_MAX=2147483648:

int num_digits(int val)
{
int d=0;

if ( val == INT_MIN ) return 11; /* 10 + extra for minus sign */

if ( val < 0 ) {
val=-val;
d++;
}

if ( val < 1e1 ) return d+1;
if ( val < 1e2 ) return d+2;
if ( val < 1e3 ) return d+3;
if ( val < 1e4 ) return d+4;
if ( val < 1e5 ) return d+5;
if ( val < 1e6 ) return d+6;
if ( val < 1e7 ) return d+7;
if ( val < 1e8 ) return d+8;
if ( val < 1e9 ) return d+9;

return d+10;
}

This code is basically just a loop unroll and the INT_MIN special case
that gets discovered during routine testing. My quick tests suggests it has
about a 2x speed-up. The binary size will be about 10% bigger but that's
offset by it not suffering from the two bugs. The only question
remaining is if to handle INT_MIN and INT_MAX more generally and if so by
how much. I have yet to come up with a clever version that generalizes without
starting to make the code ugly but it's probably possible. What set of values
must we want to handle for INT_MIN and what set of values must we handle for
INT_MAX? Are just INT_MIN = -2147483648 and INT_MAX = 2147483647 sufficient?

There's a patch attached below. Please test and modify as you like if
you wish to merge.

Cheers,
Jason

--- misc.c 2023-01-26 21:00:02.115701806 +0800
+++ misc_new.c 2023-01-26 21:04:33.059488195 +0800
@@ -1,4 +1,5 @@
// SPDX-License-Identifier: GPL-2.0
+#include <limits.h>
#include <asm/misc.h>

/*
@@ -8,17 +9,31 @@
*/
int num_digits(int val)
{
- int m = 10;
- int d = 1;
+ int d=0; /* initialize minus sign flag */

- if (val < 0) {
- d++;
- val = -val;
- }
+ /* Special case: -INT_MIN may be undefined behavior on some
+ architectures like those that use two's complement. Compilers often
+ implement -INT_MIN such that it equals INT_MIN which mean the
+ val=-val statement below would not actually reverse the sign. */
+ if ( val == INT_MIN ) return 11; /* Assuming INT_MIN=-2147483648
(10 digits plus 1 for the minus sign) */

- while (val >= m) {
- m *= 10;
- d++;
+ if ( val < 0 ) {
+ val=-val;
+ d++; // turn on minus sign flag
}
- return d;
+
+ if ( val < 1e1 ) return d+1;
+ if ( val < 1e2 ) return d+2;
+ if ( val < 1e3 ) return d+3;
+ if ( val < 1e4 ) return d+4;
+ if ( val < 1e5 ) return d+5;
+ if ( val < 1e6 ) return d+6;
+ if ( val < 1e7 ) return d+7;
+ if ( val < 1e8 ) return d+8;
+ if ( val < 1e9 ) return d+9;
+
+ /* Assuming INT_MAX=2147483647, at this point 1e9 <= val <= INT_MAX
+ so all possible values contain ten digits. We can return without
+ another if-statement because val <= INT_MAX is always true. */
+ return d+10;
}

Next message: Rob Herring: "Re: [PATCH] dt-bindings: usb: samsung,exynos-dwc3: allow unit address in DTS"
Previous message: Linus Walleij: "Re: [PATCH 0/2] MediaTek pinctrl cleanups and improvements"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]