Re: [PATCH v4 3/5] stack: Optionally randomize kernel stack offset each syscall

From: Arvind Sankar
Date: Mon Jun 22 2020 - 20:05:16 EST


On Mon, Jun 22, 2020 at 04:07:11PM -0700, Kees Cook wrote:
> On Mon, Jun 22, 2020 at 06:56:15PM -0400, Arvind Sankar wrote:
> > On Mon, Jun 22, 2020 at 12:31:44PM -0700, Kees Cook wrote:
> > > +
> > > +#define add_random_kstack_offset() do { \
> > > +	if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
> > > +				&randomize_kstack_offset)) { \
> > > +		u32 offset = this_cpu_read(kstack_offset); \
> > > +		u8 *ptr = __builtin_alloca(offset & 0x3FF); \
> > > +		asm volatile("" : "=m"(*ptr)); \
> > > +	} \
> > > +} while (0)
> >
> > This feels a little fragile. ptr doesn't escape the block, so the
> > compiler is free to restore the stack immediately after this block. In
> > fact, given that all you've said is that the asm modifies *ptr, but
> > nothing uses that output, the compiler could eliminate the whole thing,
> > no?
> >
> > https://godbolt.org/z/HT43F5
> >
> > gcc restores the stack immediately, if no function calls come after it.
> >
> > clang completely eliminates the code if no function calls come after.
>
> nothing uses the stack in your example. And adding a barrier (which is
> what the "=m" is) doesn't change it.

Yeah, I realized that that was what's going on. And clang isn't actually
DCE'ing it, it's taking advantage of the red zone since my alloca was
small enough.

But I still don't see anything _stopping_ the compiler from optimizing
this better in the future. The "=m" is not a barrier: it just informs
the compiler that the asm produces an output value in *ptr (and no other
outputs). If nothing can consume that output, it doesn't stop the
compiler from freeing the allocation immediately after the asm instead
of at the end of the function.

I'm talking about something like
	asm volatile("" : : "r" (ptr) : "memory");
which tells the compiler that the asm may change memory arbitrarily.

Here, we don't really use it as a barrier, but to tell the compiler that
the asm may have stashed the value of ptr somewhere in memory, so it's
not free to reuse the space that it pointed to until the function
returns (unless it can prove that nothing accesses memory, not just that
nothing accesses ptr).
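
A minimal userspace sketch of that variant, assuming GCC or clang extended
asm and a stack that grows downward. The names frame_with_offset() and
callee_frame() are made up for illustration and are not part of the patch;
the point is only to show the pointer being passed as an input together
with a "memory" clobber, and a later call observing the shifted stack.

```c
#include <stdint.h>

/* Hypothetical helper: report where this callee's frame sits on the stack. */
__attribute__((noinline)) uintptr_t callee_frame(void)
{
	return (uintptr_t)__builtin_frame_address(0);
}

/*
 * Hypothetical stand-in for the macro: allocate up to 0x3FF bytes with
 * alloca, then hand the pointer to an empty asm as an input with a
 * "memory" clobber. The compiler must assume the asm may have stored ptr
 * somewhere, so it cannot reuse the allocation while later code might
 * still access memory.
 */
__attribute__((noinline)) uintptr_t frame_with_offset(unsigned int offset)
{
	unsigned char *ptr = __builtin_alloca(offset & 0x3FF);

	__asm__ volatile("" : : "r" (ptr) : "memory");

	/* Any call made from here on runs below the extra allocation. */
	return callee_frame();
}
```

With current compilers, frame_with_offset(0x200) should report a callee
frame roughly 0x200 bytes lower than frame_with_offset(0). That is an
observation about today's code generation, not something the language
guarantees.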