Re: [RFC PATCH v2 17/27] x86/cet/shstk: User-mode shadow stack support

From: Yu-cheng Yu
Date: Fri Jul 13 2018 - 14:06:48 EST


On Wed, 2018-07-11 at 15:21 -0700, Andy Lutomirski wrote:
> >
> > On Jul 11, 2018, at 2:51 PM, Jann Horn <jannh@xxxxxxxxxx> wrote:
> >
> > On Wed, Jul 11, 2018 at 2:34 PM Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> > >
> > > >
> > > > On Jul 11, 2018, at 2:10 PM, Jann Horn <jannh@xxxxxxxxxx> wrote:
> > > >
> > > > >
> > > > > On Tue, Jul 10, 2018 at 3:31 PM Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx> wrote:
> > > > >
> > > > > This patch adds basic shadow stack enabling/disabling routines.
> > > > > A task's shadow stack is allocated from memory with VM_SHSTK
> > > > > > flag set and read-only protection.  The shadow stack is
> > > > > allocated to a fixed size.
> > > > >
> > > > > Signed-off-by: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx>
> > > > [...]
> > > > >
> > > > > diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
> > > > > new file mode 100644
> > > > > index 000000000000..96bf69db7da7
> > > > > --- /dev/null
> > > > > +++ b/arch/x86/kernel/cet.c
> > > > [...]
> > > > >
> > > > > +static unsigned long shstk_mmap(unsigned long addr, unsigned long len)
> > > > > +{
> > > > > > +       struct mm_struct *mm = current->mm;
> > > > > > +       unsigned long populate;
> > > > > > +
> > > > > > +       down_write(&mm->mmap_sem);
> > > > > > +       addr = do_mmap(NULL, addr, len, PROT_READ,
> > > > > > +                      MAP_ANONYMOUS | MAP_PRIVATE, VM_SHSTK,
> > > > > > +                      0, &populate, NULL);
> > > > > > +       up_write(&mm->mmap_sem);
> > > > > > +
> > > > > > +       if (populate)
> > > > > > +               mm_populate(addr, populate);
> > > > > > +
> > > > > > +       return addr;
> > > > > +}
> > [...]
> > >
> > > >
> > > > Should the kernel enforce that two shadow stacks must have a guard
> > > > page between them so that they can not be directly adjacent, so that
> > > > if you have too much recursion, you can't end up corrupting an
> > > > adjacent shadow stack?
> > > > I think the answer is a qualified "no". I would like to instead enforce a general guard page on all mmaps that don't use MAP_FORCE. We *might* need to exempt any mmap with an address hint for
> > > compatibility.
> > I like this idea a lot.
> >
> > >
> > > My commercial software has been manually adding guard pages on every single mmap done by tcmalloc for years, and it has caught a couple bugs and costs essentially nothing.
> > >
> > > > Hmm. Linux should maybe add something like Windows' "reserved" virtual memory. It's basically a way to ask for a VA range that explicitly contains nothing and can subsequently be turned into
> > > something useful with the equivalent of MAP_FORCE.
> > What's the benefit over creating an anonymous PROT_NONE region? That
> > the kernel won't have to scan through the corresponding PTEs when
> > tearing down the mapping?
> Make it more obvious what's happening and avoid accounting issues?  What I've actually used is MAP_NORESERVE | PROT_NONE, but I think this still counts against the VA rlimit. But maybe that's
> actually the desired behavior.

We can put a NULL at both ends of a SHSTK to guard against corruption.

Yu-cheng
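
For reference, the userspace guard-page approach mentioned in the quoted
discussion (reserving with MAP_NORESERVE | PROT_NONE and opening up only the
part that is needed) could look roughly like the sketch below. It is an
illustration only and not part of the patch: the helper names
map_with_guards() and unmap_with_guards() are hypothetical, and it uses
ordinary read/write memory, not a VM_SHSTK mapping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round len up to whole pages; returns 0 on bad input or overflow. */
static size_t guarded_body_size(size_t len, size_t page)
{
	if (len == 0 || len > SIZE_MAX - 3 * page)
		return 0;
	return (len + page - 1) / page * page;
}

/*
 * Hypothetical helper: map at least `len` usable bytes with one
 * inaccessible guard page on each side.  Returns the usable address,
 * or MAP_FAILED on error.
 */
static void *map_with_guards(size_t len)
{
	long ps = sysconf(_SC_PAGESIZE);
	size_t page, body, total;
	char *base;

	if (ps <= 0)
		return MAP_FAILED;
	page = (size_t)ps;
	body = guarded_body_size(len, page);
	if (body == 0)
		return MAP_FAILED;
	total = body + 2 * page;

	/* Reserve the whole range, guards included, with no access rights. */
	base = mmap(NULL, total, PROT_NONE,
		    MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (base == MAP_FAILED)
		return MAP_FAILED;

	/* Open up only the middle; the first and last page stay PROT_NONE. */
	if (mprotect(base + page, body, PROT_READ | PROT_WRITE) != 0) {
		munmap(base, total);
		return MAP_FAILED;
	}
	return base + page;
}

/* Hypothetical helper: release a region obtained from map_with_guards(). */
static int unmap_with_guards(void *addr, size_t len)
{
	long ps = sysconf(_SC_PAGESIZE);
	size_t page, body;

	if (ps <= 0)
		return -1;
	page = (size_t)ps;
	body = guarded_body_size(len, page);
	if (body == 0)
		return -1;
	return munmap((char *)addr - page, body + 2 * page);
}
```

With this layout, an access that runs off either end of the usable region
faults on a PROT_NONE page instead of touching a neighbouring mapping. As
noted in the thread, the reserved guard pages probably still count against
the address-space rlimit.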