Re: Redoing eXclusive Page Frame Ownership (XPFO) with isolated CPUs in mind (for KVM to isolate its guests per CPU)

From: Khalid Aziz
Date: Fri Sep 07 2018 - 17:31:02 EST


On 08/30/2018 10:00 AM, Julian Stecklina wrote:
> Hey everyone,
>
> On Mon, 20 Aug 2018 15:27 Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> On Mon, Aug 20, 2018 at 3:02 PM Woodhouse, David <dwmw@xxxxxxxxxxxx> wrote:
>>>
>>> It's the *kernel* we don't want being able to access those pages,
>>> because of the multitude of unfixable cache load gadgets.
>>
>> Ahh.
>>
>> I guess the proof is in the pudding. Did somebody try to forward-port
>> that patch set and see what the performance is like?
>
> I've been spending some cycles on the XPFO patch set this week. For the
> patch set as it was posted for v4.13, the performance overhead of
> compiling a Linux kernel is ~40% on x86_64[1]. The overhead comes almost
> completely from TLB flushing. If we can live with stale TLB entries
> allowing temporary access (which I think is reasonable), we can remove
> all TLB flushing (on x86). This reduces the overhead to 2-3% for
> kernel compile.
>
> There were no problems in forward-porting the patch set to master.
> You can find the result here, including a patch that makes the TLB flushing
> configurable:
> http://git.infradead.org/users/jsteckli/linux-xpfo.git/shortlog/refs/heads/xpfo-master
>
> It survived some casual stress-ng runs. I can rerun the benchmarks on
> this version, but I doubt there is any change.
>
>> It used to be just 500 LOC. Was that because they took horrible
>> shortcuts?
>
> The patch is still fairly small. As for the horrible shortcuts, I let
> others comment on that.


Looks like the performance impact can be a whole lot worse. On my test system with 2 Xeon Platinum 8160 (HT enabled) CPUs and 768 GB of memory, I am seeing a very high penalty with XPFO when building the 4.18.6 kernel sources with "make -j60":

            No XPFO patch    XPFO patch (no TLB flush)    XPFO patch (TLB flush)
sys time    52m 54.036s      55m 47.897s                  434m 8.645s

That is ~5.5% worse with TLB flush disabled and ~720% worse with TLB flush enabled. This test was with the kernel sources being compiled on an ext4 filesystem. XPFO seems to affect ext2 even more. With an ext2 filesystem, the impact was ~18.6% and ~900%, respectively.
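
For anyone who wants to redo the arithmetic, here is a minimal sketch that derives the relative overhead from the sys times in the table above. The helper names (to_seconds, overhead_percent) and the dictionary keys are made up for illustration, and it assumes the times are in the "Nm S.SSSs" form printed by the shell's time. For the ext4 numbers it gives roughly 5.5% and 720%.

```python
import re


def to_seconds(t: str) -> float:
    """Convert a time string such as '52m 54.036s' to seconds."""
    m = re.fullmatch(r"\s*(\d+)m\s*([\d.]+)s\s*", t)
    if m is None:
        raise ValueError(f"unrecognized time format: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def overhead_percent(baseline: str, measured: str) -> float:
    """Relative increase of `measured` over `baseline`, in percent."""
    base = to_seconds(baseline)
    return (to_seconds(measured) - base) / base * 100.0


# sys times copied from the ext4 table; the keys are illustrative labels.
SYS_TIMES = {
    "no XPFO": "52m 54.036s",
    "XPFO, no TLB flush": "55m 47.897s",
    "XPFO, TLB flush": "434m 8.645s",
}

if __name__ == "__main__":
    baseline = SYS_TIMES["no XPFO"]
    for label in ("XPFO, no TLB flush", "XPFO, TLB flush"):
        pct = overhead_percent(baseline, SYS_TIMES[label])
        print(f"{label}: {pct:+.1f}% sys time")
```

This only compares sys time; real and user time were not reported here, so it says nothing about wall-clock impact.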

--
Khalid