Re: [PATCH] Fix prctl(PR_GET_NAME) to not leak random trailing bytes

From: Linus Torvalds
Date: Sat Aug 28 2021 - 14:04:12 EST


On Fri, Aug 27, 2021 at 5:19 AM Helge Deller <deller@xxxxxx> wrote:
>
> Oh, the parisc strncpy() asm does NOT zero-fill the target address !!
> That's the bug.
> I thought strncpy would just copy up to given number of chars.

Yeah, strncpy() is a horrible horrible function.

It doesn't copy any NUL character at all if the source length was
longer than the 'n', so it doesn't necessarily result in a
NUL-terminated string.

And it fills the target with NUL characters at the end, which is
usually horribly inefficient and means that you should never ever use
it with big destination buffers. So it can be a surprisingly bad
function to use for the - not all that unusual situation - where you
have lots of space in the destination buffer, and use the 'n' version
of strcpy() just to be safe for odd situations. It will zero-fill all
that space, usually for no good reason.

Very different from the other 'n' functions like snprintf() and
friends that people use for that same "destination buffer safety"
reason.

So it's almost never the right thing to use, even if it's the most
traditional, most common and - by name - most obvious "copy string of
at most length 'n'" function.

It so happens that "comm[]" is probably one of *very* few situations
in the kernel where we really do want to use strncpy(), and where we
don't just NUL-terminate, but NUL-fill the buffer.

Of course, that "comm[]" behavior is unusual these days, but I think
it was a lot more usual back in the early 70's, when that whole
"small, fixed-size string buffer" model was much much more common than
it is today.

It is, after all, the exact same reason why the C language linker
rules for identifiers were historically "only the first 7 letters are
necessarily significant". Because "use a fixed 8-byte buffer for a
string" made sense at the time in ways it doesn't necessarily make all
that much sense today.

So that odd and nasty behavior of strncpy() makes a lot more sense in
the historical context - it's just that that context is 50 years ago.

While mentioning all the oddities of 'strncpy()', it's also worth
noting that despite the similarities in the name,
"strncpy_from_user()" does *not* fill the end of the destination
buffer with NUL characters, and does *not* act like strncpy(). The
user string copy function obviously also has a very very different
return value, which hopefully makes it more obvious that it's a very
different beast.

Most of the time, if you actually want to copy a string, and have a
limited destination buffer, you should use 'strscpy()'. Of the
"limited size string" routines, it's pretty much the only sane one,
and it guarantees - as long as the target size is non-zero - that the
target is NUL-terminated without doing the NUL filling.

(The BSD 'strlcpy()' is horribly broken because the return value
semantics means that it will have to find the terminating NUL of the
*source* string, even if the source string is horribly long, or
untrusted and unterminated).

> Interestingly the kernel runs quite well and we don't see any bigger breakage.

Yeah, almost nothing actually cares about the odd NUL filling that -
as you - few people realize is even part of strncpy().

> Anyway, the function needs fixing.

I'd suggest you just use the default one in lib/string.c and not
override it with __HAVE_ARCH_STRNCPY.

For example, on x86, we for purely historical reasons do that
__HAVE_ARCH_STRNCPY thing, but only for the legacy 32-bit code. x86-64
uses that generic lib/string.c version (and honestly, the 32-bit
arch-specific x86 one is almost certainly much worse, but hey, it
really exists purely due to hysterical raisins).

That generic thing is a slow byte-per-byte implementation, but I don't
think we have ever had a situation where it really matters.

It's also slightly oddly implemented - notice how the way it does that
zero fill is by simply not incrementing the source pointer once it
finds a NUL character. Simple and effective (even if perhaps not
_efficient_), but it can make people miss what is going on because
it's an unusual pattern.

Linus