Re: [RFC PATCH v1 5/6] powerpc/mm: Add a framework for Kernel Userspace Access Protection

From: Christophe LEROY
Date: Wed Nov 21 2018 - 03:32:42 EST




Le 21/11/2018 Ã 03:26, Russell Currey a ÃcritÂ:
On Wed, 2018-11-07 at 16:56 +0000, Christophe Leroy wrote:
This patch implements a framework for Kernel Userspace Access
Protection.

Then subarches will have to possibility to provide their own
implementation
by providing setup_kuap(), and lock/unlock_user_rd/wr_access

We separate read and write accesses because some subarches like
book3s32 might only support write access protection.

Signed-off-by: Christophe Leroy <christophe.leroy@xxxxxx>

Separating read and writes does have a performance impact, I'm doing
some benchmarking to find out exactly how much - but at least for radix
it means we have to do a RMW instead of just a write. It does add some
amount of security, though.

The other issue I have is that you're just locking everything here
(like I was), and not doing anything different for just reads or
writes. In theory, wouldn't someone assume that they could (for
example) unlock reads, lock writes, then attempt to read? At which
point the read would fail, because the lock actually locks both.

I would think we either need to bundle read/write locking/unlocking
together, or only implement this on platforms that can do one at a
time, unless there's a cleaner way to handle this. Glancing at the
values you use for 8xx, this doesn't seem possible there, and it's a
definite performance hit for radix.

At the same time, as you say, it would suck for book3s32 that can only
do writes, but maybe just doing both at the same time and if
implemented for that platform it could just have a warning that it only
applies to writes on init?

Well, I see your points. My idea was not to separate read and write
on platform that can lock both. I think it is no problem to also unlocking writes when we are doing a read, so on platforms that can do both I think both should do the same..

The idea was to avoid spending time unlocking writes for doing a read on platforms on which reads are not locked. And for platforms able to independently unlock/lock reads and writes, if only unlocking reads can improve performance it can be interesting as well.

For book3s/32, locking/unlocking will be done through Kp/Ks bits in segment registers, the function won't be trivial as it may involve more than one segment at a time. So I just wanted to avoid spending time doing that for reads as reads won't be protected. And may also be the case on older book3s/64, may not it ?
On Book3s/32, the page protection bits are as follows:

Key 0 1
PP
00 RW NA
01 RW RO
10 RW RW
11 RO RO

So the idea is to encode user RW with PP01 (instead of PP10 today) and user RO with PP11 (as done today), giving Key0 to user and Key1 to kernel (today both user and kernel have Key1). Then when kernel needs to write, we change Ks to Key0 in segment register for the involved segments.

I'm not sure there is any risk that someone nests unlocks/locks for reads and unlocks/locks for writes, because the unlocks/locks are done in very limited places.

Christophe



Curious for people's thoughts on this.

- Russell