Re: bug in generic strncpy_from_user

From: Heiko Carstens
Date: Tue Feb 26 2013 - 07:58:26 EST


On Mon, Feb 25, 2013 at 09:24:31PM -0800, Linus Torvalds wrote:
> On Mon, Feb 25, 2013 at 8:54 PM, Heiko Carstens
> <heiko.carstens@xxxxxxxxxx> wrote:
> > FWIW, I think the generic strncpy_from_user() implementation has the same
> > bug as the s390 variant:
> >
> > Following example:
> >
> > If there is a NUL terminated string "a\0" starting at e.g. 0xfffe this
> > will cause an (invalid) page fault for the subsequent page at 0x10000
> > in case of CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS for e.g. this invocation:
> >
> > strncpy_from_user(kname, 0xfffe, 4096);
> >
> > If the user space page that contained the string was the last within a
> > VMA this may cause a -EFAULT return value for the __get_user() invocation
> > in strncpy_from_user() and byte-at-a-time processing will happen.
> >
> > This is true and works fine unless the next VMA is the stack.
> >
> > With "ulimit -s unlimited" the stack will just grow downwards to where the
> > page fault happened, instead of -EFAULT being returned.
>
> This should _never_ be the case.

I was wrong: -EFAULT will be returned, but the vma will grow nevertheless,
which in turn will cause trouble.

> Our stack-down growing handling requires the whole guard page, and
> should never grow down to touch the previous vma.
>
> If it does, that's a bug in our stack growing, not in
> strncpy_from_user(). Which could definitely be the case, of course.
> But we should fix it there, not change the strncpy.

After all, expand_stack() simply expands the stack to the requested address.
Since it does not take the guard page into account, the vma will grow
exactly down to the previous mapping.
Only afterwards, in handle_mm_fault(), is there a check whether the access
happened within the guard page, which then makes the access fault.
However, by then the stack vma has already grown, and user space is broken
afterwards.
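
The ordering described above (grow first, check the guard page second) can
be sketched as a small user-space model. This is only an illustration under
stated assumptions, not kernel code: struct model_vma, model_stack_fault()
and MODEL_PAGE_SIZE are hypothetical names, and the model only covers the
case where the previous mapping ends directly below the faulting page.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_PAGE_SIZE 0x1000UL	/* assumed 4K pages */

/* Hypothetical stand-in for a grows-down stack vma, not the kernel's struct. */
struct model_vma {
	unsigned long start;	/* lowest address, guard page included */
	unsigned long end;	/* first address above the stack */
};

/*
 * Model of a fault at addr below the stack. prev_end is the end of the
 * mapping directly below the stack. Returns 0 or -EFAULT; stack->start is
 * updated in both cases, which is the problem discussed here.
 */
static int model_stack_fault(struct model_vma *stack, unsigned long prev_end,
			     unsigned long addr)
{
	unsigned long page = addr & ~(MODEL_PAGE_SIZE - 1);

	assert(page >= prev_end && addr < stack->end);

	/* Step 1: grow down to the faulting page, no guard page check yet. */
	if (page < stack->start)
		stack->start = page;

	/*
	 * Step 2: the lowest page of the vma is the guard page. With the
	 * previous mapping directly adjacent, the access is refused.
	 */
	if (page == stack->start && page == prev_end)
		return -EFAULT;

	return 0;
}
```

Called with a stack vma of 0x7ffde000-0x80000000, prev_end = 0x4002c000 and
addr = 0x4002c000, the model returns -EFAULT but leaves the stack starting
at 0x4002c000.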

Here is an example. This is the state before the kernel accesses memory right
behind the ld.so.cache mapping:

[root@p2345007 ~]# cat /proc/2285/maps
00400000-00401000 r-xp 00000000 5e:01 148858 /root/a.out
00401000-00402000 rw-p 00000000 5e:01 148858 /root/a.out
40000000-4001c000 r-xp 00000000 5e:01 260585 /lib/ld-2.14.1.so
4001c000-4001e000 rw-p 0001b000 5e:01 260585 /lib/ld-2.14.1.so
4001e000-40020000 r-xp 00000000 00:00 0 [vdso]
40020000-40023000 rw-p 00000000 00:00 0
40023000-4002c000 r--p 00000000 5e:01 39 /etc/ld.so.cache
7ffdf000-80000000 rw-p 00000000 00:00 0 [stack]

Then an "open" system call for the file "/lib/libc.so.6" happens. The string is
NUL-terminated and fits completely into the user space page at 0x4002b000.
However, when copying from user space with strncpy_from_user(), the kernel also
reads from the page at 0x4002c000.
This generates a fault and grows the stack vma downwards to 0x4002c000:

[root@p2345007 ~]# cat /proc/2285/maps
00400000-00401000 r-xp 00000000 5e:01 148858 /root/a.out
00401000-00402000 rw-p 00000000 5e:01 148858 /root/a.out
40000000-4001c000 r-xp 00000000 5e:01 260585 /lib/ld-2.14.1.so
4001c000-4001e000 rw-p 0001b000 5e:01 260585 /lib/ld-2.14.1.so
4001e000-40020000 r-xp 00000000 00:00 0 [vdso]
40020000-40023000 rw-p 00000000 00:00 0
40023000-4002c000 r--p 00000000 5e:01 39 /etc/ld.so.cache
4002d000-80000000 rw-p 00000000 00:00 0 [stack]
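
Why a string that fits into one page can still cause a read from the next
page is easier to see with a little arithmetic. The following is a rough
user-space sketch, assuming a copy loop that reads whole words starting at
the (possibly unaligned) source address and stops after the word containing
the NUL byte. The function name and parameters are made up for illustration;
this is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * src:       user address of the string
 * len:       string length including the terminating NUL
 * word:      number of bytes fetched per read
 * page_size: page size, must be a power of two
 *
 * Returns true if the word that contains the NUL byte extends into a
 * different page than the NUL byte itself.
 */
static bool over_read_touches_next_page(unsigned long src, size_t len,
					 size_t word, unsigned long page_size)
{
	unsigned long nul_addr, last_read;

	assert(len > 0 && word > 0);
	assert(page_size && (page_size & (page_size - 1)) == 0);

	nul_addr = src + len - 1;
	/* Last byte of the word that contains the NUL. */
	last_read = src + ((len - 1) / word) * word + word - 1;

	return (nul_addr & ~(page_size - 1)) != (last_read & ~(page_size - 1));
}
```

For the "a\0" string at 0xfffe from the quoted example, with 8 byte words
and 4K pages, the NUL is at 0xffff while the word read extends to 0x10005,
so the function returns true.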

/proc/<pid>/maps explicitly excludes the guard page, so the stack vma does
indeed start at 0x4002c000. Every access to the guard page will still fault,
though.
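
The difference between the address shown in the maps output and the real
start of the vma is just one page. As a minimal sketch, assuming 4K pages
and that only the start of a grows-down stack vma is adjusted
(model_displayed_start() and MODEL_PAGE_SIZE are hypothetical names, not
kernel symbols):

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE 0x1000UL	/* assumed 4K pages */

/*
 * Returns the start address as it would be printed: for a grows-down
 * stack the lowest page (the guard page) is skipped, every other mapping
 * is printed unchanged.
 */
static unsigned long model_displayed_start(unsigned long vm_start,
					   bool grows_down)
{
	assert((vm_start & (MODEL_PAGE_SIZE - 1)) == 0);

	return grows_down ? vm_start + MODEL_PAGE_SIZE : vm_start;
}
```

With vm_start = 0x4002c000 for the stack this gives 0x4002d000, which
matches the last line of the second listing.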
