Re: [PATCH 24/30] x86, kaiser: disable native VSYSCALL

From: Andy Lutomirski
Date: Thu Nov 09 2017 - 21:26:06 EST


On Thu, Nov 9, 2017 at 5:22 PM, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> wrote:
> On 11/09/2017 05:04 PM, Andy Lutomirski wrote:
>> On Thu, Nov 9, 2017 at 4:57 PM, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> wrote:
>>> On 11/09/2017 04:53 PM, Andy Lutomirski wrote:
>>>>> The KAISER code attempts to "poison" the user portion of the kernel page
>>>>> tables. It detects the entries that it wants to
>>>>> poison in two ways:
>>>>> * Looking for addresses >= PAGE_OFFSET
>>>>> * Looking for entries without _PAGE_USER set
>>>> What do you mean "poison"?
>>>
>>> I meant the _PAGE_NX magic that we do in here:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/daveh/x86-kaiser.git/commit/?h=kaiser-414rc7-20171108&id=c4f7d0819170761f092fcf2327b85b082368e73a
>>>
>>> to ensure that userspace is unable to run on the kernel PGD.
>>
>> Aha, I get it. Why not just drop the _PAGE_USER check? You could
>> instead warn if you see a _PAGE_USER page that doesn't have the
>> correct address for the vsyscall.
>
> The _PAGE_USER check helps us with kernel things that want to create
> mappings below PAGE_OFFSET. The EFI code was the prime user for this.
> Without this, we poison the EFI mappings and the EFI calls die.

OK, let's see if I understand. EFI, and maybe some other code, creates
low mappings with _PAGE_USER clear that are intended to be executed in
kernel mode, and, if you just set NX on all low mappings in the kernel
page tables, those mappings stop working and the EFI calls die.

Here are two proposals to address this without breaking vsyscalls.

1. Set NX on low mappings that are _PAGE_USER. Don't set NX on high
mappings but, optionally, warn if you see _PAGE_USER on any address
that isn't the vsyscall page.

2. Ignore _PAGE_USER entirely and just mark the EFI mm as special so
KAISER doesn't muck with it.
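
To make the difference between the two concrete, here is a rough,
untested userspace sketch of the per-PGD-entry decision. Every sk_* name
is made up for illustration and none of this is the actual KAISER code.
The constants mirror the usual x86-64 4-level values (PAGE_OFFSET,
VSYSCALL_ADDR, the _PAGE_USER and _PAGE_NX bits), but treat them as
assumptions, and the "special mm" flag in option 2 is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, modeled on x86-64 4-level paging; illustration only. */
#define SK_PAGE_USER     (1ULL << 2)
#define SK_PAGE_NX       (1ULL << 63)
#define SK_PAGE_OFFSET   0xffff880000000000ULL
#define SK_VSYSCALL_ADDR 0xffffffffff600000ULL
#define SK_PGDIR_MASK    (~((1ULL << 39) - 1))   /* one PGD entry = 512 GiB */

/*
 * Option 1: set NX only on low entries that have _PAGE_USER set.
 * High entries are left alone; *warn is set if a high entry has
 * _PAGE_USER but does not cover the vsyscall page.
 */
static uint64_t sk_option1(uint64_t addr, uint64_t pgd, bool *warn)
{
	*warn = false;

	if (addr < SK_PAGE_OFFSET) {
		if (pgd & SK_PAGE_USER)
			pgd |= SK_PAGE_NX;
		return pgd;
	}

	if ((pgd & SK_PAGE_USER) &&
	    (addr & SK_PGDIR_MASK) != (SK_VSYSCALL_ADDR & SK_PGDIR_MASK))
		*warn = true;

	return pgd;
}

/*
 * Option 2: ignore _PAGE_USER entirely. Low entries get NX unless the
 * mm is flagged as special (hypothetical flag standing in for "this is
 * the EFI mm"). High entries are left alone.
 */
static uint64_t sk_option2(uint64_t addr, uint64_t pgd, bool mm_is_special)
{
	if (!mm_is_special && addr < SK_PAGE_OFFSET)
		pgd |= SK_PAGE_NX;

	return pgd;
}
```

With option 1, an EFI-style low entry without _PAGE_USER passes through
unchanged, and so does the entry covering the vsyscall page. With option
2, the same EFI entry survives only because its mm is flagged.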

--Andy