Re: [PATCH 0/3] sysctl: detect overflows when setting integers

From: Alexey Dobriyan
Date: Tue Mar 31 2015 - 07:51:00 EST


On Mon, Mar 30, 2015 at 8:37 PM, Heinrich Schuchardt <xypron.glpk@xxxxxx> wrote:
> Hello Alexey,
>
> thank you for reviewing.
>
> On 30.03.2015 14:34, Alexey Dobriyan wrote:
>>> Unfortunately functions simple_strtoul and simple_strtoull cannot
>>> be replaced by kstrtoul and kstrtoull in some places, because they
>>> expect a zero terminated string instead of returning a pointer to
>>> the character after the last digit.
>>>
>>> This patch introduces two new functions kstrtoul_e and kstrtoull_e
>>> which fill this gap.
>>
>> Well, there were two ideas:
>> a) to convert first, see what's left and generalize it,
>> b) kstrtox() should be used only in one place --
>> parsing integers in proc/sysfs files.
>
> Neither a) nor b) are mentioned in your patch
> https://lkml.org/lkml/2011/2/26/52.
>
> How does a) apply to my patch series?

Conversion is far from finished, so people do not see clearly what
interface they want.

> The patch series is about parsing integers inside the /proc mount so
> what is the conflict with b)?
>
> vsprintf.c has the following comment for simple_strtol:
> * This function is obsolete. Please use kstrtol instead.
>
> Could you, please, elaborate why kstrtox.c should not be used in other
> places? Should we duplicate these functions instead of reusing them?
>
>>
>> The functions can probably be replaced by sscanf() (I didn't look closely).
>
> No. sscanf does not return the end of the parsed string. It requires
> knowing beforehand whether the string contains an octal, decimal or
> hexadecimal number.

Explicit base is a good point.
As for knowing the end of the string, how often do you really want to know it?

>> I hate "_e" suffix with passion.
>
> Which names would you appreciate?

parse_integer() :-)

>> C in 2015 doesn't have this arcane concept known as optional
>> parameters so I'd suggest to hack around with always returning
>> number of OK characters
>
> Which functions should return the number of OK characters?

Either the kstrto*() family can return a positive count, or a new
interface can.
Obvious question -- what to do with the trailing newline.

>> OR add one (just one) more separate
>> interface
>>
>> unsigned int parse_integer(const char *s, unsigned int base, T *p);
>
> Why would you call this function parse_integer and not kstrtox?

Because C style is horrible: atoi, atol, atoll, strtol, strtoll,
strtoul, strtoull, ...
Embedding type into name, meh.

>> and dispatching with __builtin_choose_expr().
>
> https://gcc.gnu.org/onlinedocs/gcc-3.4.5/gcc/Other-Builtins.html
> describes how to use __builtin_types_compatible_p together with
> __builtin_choose_expr().
>
> I do not understand how to avoid exposing the interfaces of the
> functions actually called by a parse_integer function multiplexer.

Well, there is one advertised interface -- parse_integer() and
2x(char, short, int, long, long long) = 12 technically public
but not advertised interfaces.
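
A minimal userspace sketch of that dispatch, assuming GNU C builtins.
Only two unsigned types are covered, the helper names
_parse_integer_uint()/_parse_integer_ull() and parse_digits() are
hypothetical, and bases above 10 are rejected to keep it short:

```c
#include <errno.h>
#include <limits.h>

/* Shared core (hypothetical): parse unsigned digits, reject values above max. */
static int parse_digits(const char *s, unsigned int base,
			unsigned long long max, unsigned long long *out)
{
	unsigned long long acc = 0;
	const char *p = s;

	if (base < 2 || base > 10)	/* sketch: digits 0-9 only */
		return -EINVAL;
	while (*p >= '0' && (unsigned int)(*p - '0') < base) {
		unsigned int d = *p - '0';

		if (acc > (max - d) / base)	/* acc * base + d would exceed max */
			return -ERANGE;
		acc = acc * base + d;
		p++;
	}
	if (p == s)
		return -EINVAL;
	*out = acc;	/* written only on success */
	return (int)(p - s);
}

/* Technically public, not advertised. */
static int _parse_integer_ull(const char *s, unsigned int base,
			      unsigned long long *val)
{
	return parse_digits(s, base, ULLONG_MAX, val);
}

static int _parse_integer_uint(const char *s, unsigned int base,
			       unsigned int *val)
{
	unsigned long long tmp;
	int rv = parse_digits(s, base, UINT_MAX, &tmp);

	if (rv < 0)
		return rv;
	*val = (unsigned int)tmp;
	return rv;
}

/*
 * The one advertised interface. The (void *) casts keep the branch that
 * is not selected free of pointer type mismatches. An unsupported type
 * selects (void)0, so using the result fails to compile.
 */
#define parse_integer(s, base, val)					\
	__builtin_choose_expr(						\
		__builtin_types_compatible_p(__typeof__(val), unsigned int *), \
		_parse_integer_uint((s), (base), (void *)(val)),	\
	__builtin_choose_expr(						\
		__builtin_types_compatible_p(__typeof__(val), unsigned long long *), \
		_parse_integer_ull((s), (base), (void *)(val)),	\
		(void)0))
```

Callers write parse_integer(s, 10, &x) for either type; the compiler
picks the helper from the type of &x.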

>> Of course if someone rewrites that abomination called
>> sysctl parsing from scratch, maybe none of this will be needed!
>
> Unfortunately you are very vague about what you dislike in sysctl.

It makes sense for proc_get_long(), mentioned in one of your patches,
to know the length of the parsed string only when the sysctl is an array
of integers, not just one integer. But most sysctls are just simple values.

Again, no one looked at what is really needed, and your adding
"char **endp" because the manpage says so just proves the point.

I suggest to scrap libc-style integer parsing and do it right.
Namely:

int parse_integer(const char *s, unsigned int base, T *val);

returns -E on error/overflow
returns (positive) number of characters consumed otherwise.
*val is written only if there were no errors
accepts whitespace (space, tab only), sign '+', sign '-'.

Error checking can still remain

rv = parse_integer(...);
if (rv < 0)
...

because if the very first character is invalid, -EINVAL is still returned.

Parsing multiple values becomes:

rv = parse_integer(s, base, &val1);
if (rv < 0)
return rv;
s += rv;
rv = parse_integer(s, base, &val2);
if (rv < 0)
return rv;
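
A minimal userspace sketch of these semantics for one concrete type,
to make the rules above testable. The names parse_integer_ll(),
digit_value() and parse_two() are hypothetical, -ERANGE is assumed as
the overflow error, and base 0 autodetection is left out:

```c
#include <errno.h>
#include <limits.h>

/* Map one character to its digit value, or -1 if it is not a digit. */
static int digit_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'z')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'Z')
		return c - 'A' + 10;
	return -1;
}

/* Returns -E on error/overflow, otherwise the number of characters consumed. */
int parse_integer_ll(const char *s, unsigned int base, long long *val)
{
	const char *p = s, *digits;
	unsigned long long acc = 0, limit;
	int negative = 0, d;

	if (base < 2 || base > 36)
		return -EINVAL;

	while (*p == ' ' || *p == '\t')		/* space and tab only */
		p++;
	if (*p == '+' || *p == '-')
		negative = (*p++ == '-');

	/* Largest allowed magnitude: LLONG_MAX, or LLONG_MAX + 1 if negative. */
	limit = (unsigned long long)LLONG_MAX + (negative ? 1 : 0);

	digits = p;
	while ((d = digit_value(*p)) >= 0 && (unsigned int)d < base) {
		if (acc > (limit - (unsigned int)d) / base)
			return -ERANGE;		/* would overflow */
		acc = acc * base + (unsigned int)d;
		p++;
	}
	if (p == digits)
		return -EINVAL;			/* no digits at all */

	/* *val is written only on success. */
	if (negative && acc > 0)
		*val = -(long long)(acc - 1) - 1;
	else
		*val = (long long)acc;
	return (int)(p - s);
}

/* Parse two values back to back; returns total characters consumed. */
int parse_two(const char *s, long long *a, long long *b)
{
	int rv, total;

	rv = parse_integer_ll(s, 10, a);
	if (rv < 0)
		return rv;
	total = rv;
	rv = parse_integer_ll(s + total, 10, b);
	if (rv < 0)
		return rv;
	return total + rv;
}
```

The caller never needs a "char **endp": the return value is the offset
to continue from.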

OpenBSD people came up with strtonum() at some point:
http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man3/strtonum.3?query=strtonum&arch=i386
The Pandora's box of "errstr" and the missing ULL parsing are obvious problems.

Maybe Plan 9 got it right?
Well, no, they are still in "multiplication overflow" phase:
http://plan9.bell-labs.com/sources/plan9/sys/src/ape/lib/ap/gen/strtoul.c