Re: [PATCH v3 00/13] Virtually mapped stacks with guard pages (x86, core)

From: Andy Lutomirski
Date: Tue Jun 21 2016 - 13:33:35 EST


On Tue, Jun 21, 2016 at 10:16 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Jun 21, 2016 at 9:45 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>>
>> So I'm leaning toward fewer cache entries per cpu, maybe just one.
>> I'm all for making it a bit faster, but I think we should weigh that
>> against increasing memory usage too much and thus scaring away the
>> embedded folks.
>
> I don't think the embedded folks will be scared by a per-cpu cache, if
> it's just one or two entries. And I really do think that even just
> one or two entries will indeed catch a lot of the cases.
>
> And yes, fork+execve() is too damn expensive in page table build-up
> and tear-down. I'm not sure why bash doesn't do vfork+exec for when it
> has to wait for the process anyway, but it doesn't seem to do that.
>

I don't know about bash, but glibc very recently fixed a long-standing
bug in posix_spawn and started using clone() in a sensible manner for
this.
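
For illustration, here is roughly what the spawn-and-wait pattern looks
like from userspace when it goes through posix_spawn instead of
fork+execve. This is only a sketch: spawn_and_wait is a made-up helper
name, not anything from bash or glibc, and whether the library implements
posix_spawn with clone() underneath depends on the glibc version.

```c
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Hypothetical helper (name is an assumption, not a real API):
 * spawn `path` with `argv`, wait for it, and return its exit status.
 * Returns -1 if the spawn or wait fails or the child died abnormally.
 */
int spawn_and_wait(const char *path, char *const argv[])
{
	pid_t pid;
	int status;

	/* posix_spawn returns an error number directly, 0 on success. */
	if (posix_spawn(&pid, path, NULL, NULL, argv, environ) != 0)
		return -1;

	/* Block until the child exits, retrying if a signal interrupts us. */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with "/bin/sh" and an argv of { "sh", "-c", "exit 7", NULL }
should return 7, assuming /bin/sh exists on the system. The point is
that the caller never needs a full copy of the parent's page tables
just to throw them away at exec time.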

FWIW, it may be a while before this can be enabled in distro kernels.
There are some code paths (*cough* crypto users *cough*) that think
that calling sg_init_one with a stack address is a reasonable thing to
do, and it doesn't work with a vmalloced stack. grsecurity works
around this by using a real lowmem higher-order stack, aliasing it
into vmalloc space, and arranging for virt_to_phys to backtrack the
alias, but eww. I think I'd rather find and fix the bugs, assuming
they're straightforward.

--Andy