Re: [kernel-hardening] Re: [RFC PATCH 6/6] arm64: add VMAP_STACK and detect out-of-bounds SP

From: Ard Biesheuvel
Date: Fri Jul 14 2017 - 08:55:52 EST


On 14 July 2017 at 13:52, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> On Fri, Jul 14, 2017 at 11:48:20AM +0100, Ard Biesheuvel wrote:
>> On 14 July 2017 at 11:32, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>> > On Thu, Jul 13, 2017 at 07:28:48PM +0100, Ard Biesheuvel wrote:
>> >> On 13 July 2017 at 18:55, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>> >> > On Thu, Jul 13, 2017 at 05:10:50PM +0100, Mark Rutland wrote:
>> >> >> On Thu, Jul 13, 2017 at 12:49:48PM +0100, Ard Biesheuvel wrote:
>> >> >> > On 13 July 2017 at 11:49, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>> >> >> > > On Thu, Jul 13, 2017 at 07:58:50AM +0100, Ard Biesheuvel wrote:
>> >> >> > >> On 12 July 2017 at 23:33, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>> > This means that we have to align the initial task, so the kernel Image
>> > will grow by THREAD_SIZE. Likewise for IRQ stacks, unless we can rework
>> > things such that we can dynamically allocate all of those.
>> >
>>
>> We can't currently do that for 64k pages, since the segment alignment
>> is only 64k. But we should be able to patch that up, I think.
>
> I was assuming that the linker would bump up the segment alignment if a
> more-aligned object were placed inside. I guess that doesn't happen in
> all cases?
>
> ... or do you mean when the EFI stub relocates the kernel, assuming
> relaxed alignment constraints?
>

No, I mean under KASLR, which randomizes at SEGMENT_ALIGN granularity.
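
As a rough illustration of why the randomization granularity matters (a
sketch with made-up numbers, not kernel code): an object that the linker
placed on a large alignment boundary only stays aligned at runtime if the
load offset is itself a multiple of that alignment. The helper names and
the 64k/128k values below are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/*
 * Hypothetical check: does an object linked at link_addr keep the
 * requested alignment after the image is displaced by offset?
 */
static bool alignment_survives(uint64_t link_addr, uint64_t offset,
                               uint64_t align)
{
    return is_aligned(link_addr + offset, align);
}
```

With a 128k-aligned link address, an offset of 64k (one step at an
assumed 64k granularity) breaks the alignment, while an offset of 128k
preserves it.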

>> >> >> I believe that determining whether the exception was caused by a stack
>> >> >> overflow is not something we can do robustly or efficiently.
>> >>
>> >> Actually, if the stack pointer is within S_FRAME_SIZE of the base, and
>> >> the faulting address points into the guard page, that is a pretty
>> >> strong indicator that the stack overflowed. That shouldn't be too
>> >> costly?
>> >
>> > Sure, but that's still a heuristic. For example, that also catches an
>> > unrelated vmalloc address gone wrong, while SP was close to the end of
>> > the stack.
>>
>> Yes, but the likelihood that an unrelated stray vmalloc access is
>> within 16 KB of a stack pointer that is close to its limit is
>> extremely low, so we should be able to live with the risk of
>> misidentifying it.
>
> I guess, but at that point, why bother?
>
> That gives us a fuzzy check for one specific "stack overflow", while not
> catching the general case.
>
> So long as we have a reliable stack trace, we can figure out that was
> the case, and we don't set the expectation that we're trying to
> categorize the general case (minefield and all).
>

Yes. As long as the context is described accurately, there is no need
to make any inferences on behalf of the user.
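
For reference, the heuristic discussed above (SP within S_FRAME_SIZE of
the stack base, and the faulting address inside the guard page) could be
sketched roughly as follows. This is a standalone illustration, not the
patch: the constants are placeholders, the function name is hypothetical,
and "within" is interpreted here as distance on either side of the base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values; the real ones depend on the kernel configuration. */
#define SKETCH_GUARD_SIZE  4096UL  /* assumed size of the guard page */
#define SKETCH_FRAME_SIZE  320UL   /* stand-in for S_FRAME_SIZE */

/*
 * stack_base is the lowest valid address of the stack; the guard page is
 * assumed to sit immediately below it.
 */
static bool looks_like_stack_overflow(uintptr_t sp, uintptr_t fault_addr,
                                      uintptr_t stack_base)
{
    assert(stack_base >= SKETCH_GUARD_SIZE);

    uintptr_t guard_start = stack_base - SKETCH_GUARD_SIZE;
    uintptr_t dist = sp > stack_base ? sp - stack_base : stack_base - sp;

    bool sp_near_base = dist < SKETCH_FRAME_SIZE;
    bool fault_in_guard = fault_addr >= guard_start &&
                          fault_addr < stack_base;

    return sp_near_base && fault_in_guard;
}
```

As noted in the thread, this remains a heuristic: an unrelated stray
access that lands in the guard page while SP happens to be near the base
would be reported as an overflow too.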