Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface

From: Eric W. Biederman
Date: Mon Mar 09 2015 - 19:12:07 EST


Kees Cook <keescook@xxxxxxxxxxxx> writes:

> On Mon, Mar 9, 2015 at 3:13 PM, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>> Dave Hansen <dave@xxxxxxxx> writes:
>>
>>> From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
>>>
>>> Physical addresses are sensitive information. There are
>>> existing, known exploits that are made easier if physical
>>> information is available. Here is one example:
>>>
>>> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>>>
>>> If you know the physical address of something you also know at
>>> which kernel virtual address you can find something (modulo
>>> highmem). It means that things that keep the kernel from
>>> accessing user mappings (like SMAP/SMEP) can be worked around
>>> because the _kernel_ mapping can get used instead.
>>>
>>> But, /proc/$pid/pagemap exposes the physical addresses of all
>>> pages accessible to userspace. This works against all of the
>>> efforts to keep kernel addresses out of places where unprivileged
>>> apps can find them.
>>>
>>> This patch introduces a "paranoid" option for /proc. It can be
>>> enabled like this:
>>>
>>> mount -o remount,paranoid /proc
>>>
>>> Or when /proc is mounted initially. When 'paranoid' mode is
>>> active, opens to /proc/$pid/pagemap will return -EPERM for users
>>> without CAP_SYS_RAWIO. It can be disabled like this:
>>>
>>> mount -o remount,notparanoid /proc
>>>
>>> The option is applied to the pid namespace, so an app that wanted
>>> a separate policy from the rest of the system could get run in
>>> its own pid namespace.
>>>
>>> I'm not really that stuck on the name. I'm not opposed to making
>>> it apply only to pagemap or to giving it a pagemap-specific
>>> name.
>>>
>>> pagemap is also the kind of feature that could be used to escalate
> >>> privileges from root into the kernel. It probably needs to be
>>> protected in the same way that /dev/mem or module loading is in
>>> cases where the kernel needs to be protected from root, thus the
>>> choice to use CAP_SYS_RAWIO.
>>
>>
>> There is already a way to make pagemap go away. It is called
>> CONFIG_PROC_PAGE_MONITOR.
>>
>> I suspect the right answer here is if you enable kernel address
> >> randomization you disable CONFIG_PROC_PAGE_MONITOR. Aka you make the
>> two options conflict with each other.
>
> It's not a good idea to make CONFIG options conflict with each other
> like this as it puts distros in a tricky spot to decide which to use.
> Allowing both and having a runtime flag of some kind tends to be the
> better option (e.g. kASLR vs Hibernation).

But there is a fundamental conflict. As such it might as well be
expressed in Kconfig.

>> That is a lot less code and a lot less to maintain.
>>
>> On the other hand if this is truly a valuable interface that you can't
>> part with we need an alternative to pagemaps that does the same job
> >> without the exploit potential. And I don't know how to do that.
>>
>> Arguing in favor of just making the options conflict is the fact that
>> kernel address randomization is pretty much snake oil. At least on
>> x86_64 the address pool is so small it can be trivially brute forced. I
>> think there are maybe 10 bits you can randomize within.
>>
>> As for a way to disable this I expect it would do better with something
> >> like a set once flag that prevents a process and all of its children
>> from accessing this file.
>>
> >> *Blink* *Blink* Did you say you are worried about escalating privileges
> >> from root into the kernel space? That is nonsense. We give root the
> >> power to shoot themselves in the foot and any proc option will be
>> something that root will be able to get around.
>>
>> The pieces of the patch description don't add up.
>
> No, that's an entirely valid use-case. You can trust the kernel but
> not root. This is the point of the "trusted_kernel" patch series that
> disables all sorts of dangerous interfaces that allow root to get at
> physical memory.
>
> This situation is more a memory leak than a direct compromise, so it
> seems like providing at least some runtime control of it (separate
> from potential future "trusted_kernel" stuff) makes sense.

I am too tired to argue about the kASLR snake-oil.

I do not think a proc mount option is at all appropriate for controlling
the behavior of the pagemap file. And "paranoid" is entirely too
generic of a string to have any meaning.

Either just tighten the permissions when kASLR is enabled, or have the
file go away entirely.
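
For reference, the exposure under discussion can be sketched in a few
lines of userspace code. This is a minimal illustration, not part of the
patch: it assumes the entry layout described in the kernel's pagemap
documentation (one 64-bit entry per virtual page, PFN in bits 0-54,
"swapped" in bit 62, "present" in bit 63). The helper names and the
ASSUMED_PAGE_OFFSET value are illustrative assumptions; the direct-map
base differs between architectures and kernel configurations.

```python
import os
import struct

PAGEMAP_ENTRY_BYTES = 8          # one 64-bit entry per virtual page
PFN_MASK = (1 << 55) - 1         # bits 0-54: page frame number when present
PAGE_SWAPPED = 1 << 62           # bit 62: page is in swap
PAGE_PRESENT = 1 << 63           # bit 63: page is present in RAM

# ASSUMPTION: base of the x86_64 kernel direct mapping of physical memory.
# Illustrative only; it varies with architecture and kernel configuration.
ASSUMED_PAGE_OFFSET = 0xFFFF880000000000


def decode_entry(entry, page_size=4096):
    """Return the physical address of a present page, or None otherwise."""
    if not entry & PAGE_PRESENT:
        return None
    return (entry & PFN_MASK) * page_size


def direct_map_alias(phys, page_offset=ASSUMED_PAGE_OFFSET):
    """Kernel virtual address at which the direct map aliases phys."""
    return page_offset + phys


def read_entry(pid, vaddr, page_size=None):
    """Read the raw pagemap entry covering vaddr in process pid.

    pid may be an integer or the string "self". Raises OSError if the
    file cannot be opened, for example when access is restricted.
    """
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")
    offset = (vaddr // page_size) * PAGEMAP_ENTRY_BYTES
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek(offset)
        data = f.read(PAGEMAP_ENTRY_BYTES)
    if len(data) != PAGEMAP_ENTRY_BYTES:
        raise OSError("short read from pagemap")
    return struct.unpack("=Q", data)[0]
```

Feeding the result of read_entry() through decode_entry() and then
direct_map_alias() is the whole chain from a user address to a kernel
alias of the same page, which is why who may open the file matters.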

If you want run-time knobs there are all kinds of run-time knobs you can
use.

If the concern is to protect against root getting into the kernel (the
"trusted_kernel" snake-oil), just compile out the pagemap file. Nothing
else is remotely interesting from a maintenance point of view.

As I said.
Nacked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>

Eric