Re: PKU usage improvements for threads

From: Dave Hansen
Date: Thu Aug 25 2022 - 10:37:31 EST


On 8/25/22 05:30, Stephen Röttger wrote:
>>> We were also thinking about if this should be a more generic feature instead of
>>> being tied to pkeys. I.e. the doc above has an alternative proposal to introduce
>>> something like a memory seal/unseal syscall.
>>> I was personally leaning towards using pkeys for this for a few reasons:
>>> * intuitively it would make sense to me to extend PKEY_DISABLE_ACCESS
>>> to also mean disable all changes to the memory, not just the data.
>> It would make some sense, but we can't do it with the existing
>> PKEY_DISABLE_ACCESS ABI. It would surely break existing users if they
>> couldn't munmap() memory that was PKEY_DISABLE_ACCESS.
> Our thought was that this could be opt-in with a prctl().

So, today, you have this:

foo = malloc(PAGE_SIZE);
pkey_mprotect(foo, PAGE_SIZE, READ|WRITE, pkey=1);
munmap(foo, PAGE_SIZE); // <-- works fine
mmap(hint=foo, ...); // now attacker controls what is mapped at foo

Which is problematic. What you want instead is something like this:

prctl(PR_ARCH_NO_MUNMAP_ON_PKEY); // or whatever
foo = malloc(PAGE_SIZE);
pkey_mprotect(foo, PAGE_SIZE, READ|WRITE, pkey=1);
wrpkru(PKEY_DISABLE_ACCESS << (pkey*2));
munmap(foo, PAGE_SIZE); // returns -EPERM (or whatever)

Which requires the kernel to check when it's modifying a VMA (like the
munmap() above) to see if PKRU _currently_ permits access to the VMA's
contents. If not, the kernel should refuse to modify the VMA.
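
To make that check concrete, here is a rough userspace model of the decision, not kernel code. It relies only on the PKRU layout (two bits per key: access-disable at bit 2*pkey, write-disable at bit 2*pkey+1). The function names and the opted_in flag, which stands in for the prctl() opt-in, are made up for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* PKRU layout: two bits per protection key. */
#define PKRU_AD_BIT 0x1u /* access-disable, at bit 2*pkey     */
#define PKRU_WD_BIT 0x2u /* write-disable,  at bit 2*pkey + 1 */

/* True if the given PKRU value leaves data access enabled for pkey. */
static bool hypothetical_pkru_allows_access(uint32_t pkru, int pkey)
{
    return ((pkru >> (pkey * 2)) & PKRU_AD_BIT) == 0;
}

/*
 * Hypothetical policy: when the process has opted in, refuse to modify
 * a VMA whose pkey is currently access-disabled in PKRU.
 * Returns 0 if the modification may proceed, -EPERM otherwise.
 */
static int hypothetical_may_modify_vma(uint32_t pkru, int vma_pkey,
                                       bool opted_in)
{
    if (!opted_in)
        return 0; /* existing ABI: PKRU is not consulted */
    if (!hypothetical_pkru_allows_access(pkru, vma_pkey))
        return -EPERM;
    return 0;
}
```

With pkey 1 access-disabled (PKRU == PKEY_DISABLE_ACCESS << 2), an opted-in munmap() of a pkey-1 VMA would be refused under this model, while VMAs tagged with other keys, and processes that never opted in, would behave as they do today.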

Like I said, I don't think this is _insane_, but I can see it breaking
perfectly innocent things. For instance, an app that today does a
free() of pkey-assigned memory might work perfectly fine for a long time
since that memory is rarely unmapped. But, the minute that malloc()
decides it needs to zap the memory, *malloc()* will fail.

I also wonder how far these semantics would go. Would madvise() work on
these access-denied VMAs?

My gut says that we don't want to mix up pkey semantics with this new
mechanism.