Re: [RFC PATCH] x86/mm/fault: Allow stack access below %rsp

From: Andy Lutomirski
Date: Mon Nov 05 2018 - 00:15:07 EST


On Sun, Nov 4, 2018 at 9:11 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>
> On Fri, Nov 2, 2018 at 3:28 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> >
> > On 11/2/18 12:50 PM, Waiman Long wrote:
> > > On 11/02/2018 03:44 PM, Dave Hansen wrote:
> > >> On 11/2/18 12:40 PM, Waiman Long wrote:
> > >>> The 64k+ limit check is kind of arbitrary. So the check is now removed
> > >>> to just let expand_stack() decide if a segmentation fault should happen.
> > >> With the 64k check removed, what's the next limit that we bump into? Is
> > >> it just the stack_guard_gap space above the next-lowest VMA?
> > > I think it is both the stack_guard_gap space above the next lowest VMA
> > > and the rlimit(RLIMIT_STACK).
> >
> > The gap seems to be hundreds of megabytes, typically where RLIMIT_STACK
> > is 8MB by default, so RLIMIT_STACK is likely to be the practical limit
> > that will be hit. So, practically, we've taken a ~64k area that we
> > would on-demand extend the stack into in one go, and turned that into
> > the full ~8MB area that you could have expanded into anyway, but all at
> > once.
> >
> > That doesn't seem too insane, especially since we don't physically back
> > the 8MB or anything. Logically, it also seems like you *should* be able
> > to touch any bit of the stack within the rlimit.
> >
> > But, on the other hand, as our comments say: "Accessing the stack below
> > %sp is always a bug." Have we been unsuccessful in convincing our gcc
> > buddies of this?
>
> FWIW, the old code is a bit bogus. Why are we restricting the range
> of stack-expanding addresses for user code without restricting the
> range of kernel uaccess addresses that would do the same thing?
>
> So I think I agree with the patch.

I should add: if this patch is *not* applied, then I think we'll need
to replace the sw_error_code check with user_mode(regs) to avoid an
info leak if CET is enabled. Because, with CET, WRUSS will allow a
*kernel* mode access (where regs->sp is the kernel stack pointer) with
user permissions.