Re: [PATCH v12 00/18] Enable FSGSBASE instructions

From: Don Porter
Date: Tue May 26 2020 - 08:42:17 EST


Hi Thomas,

On 5/22/20 8:45 PM, Thomas Gleixner wrote:
Don,

Don Porter <porter@xxxxxxxxxx> writes:
On 5/19/20 12:48 PM, Jarkko Sakkinen wrote:
On Tue, May 19, 2020 at 01:03:25AM +0200, Thomas Gleixner wrote:

That justifies to write books which recommend to load a kernel module
which creates a full unprivileged root hole. I bet none of these papers
ever mentioned that.

I wanted to clarify that we never intended the Graphene kernel module
you mention for production use, as well as to comment in support of this
patch.

let me clarify, that despite your intentions:

- there is not a single word in any paper, slide deck, documentation
etc. which mentions that loading this module and enabling FSGSBASE
behind the kernel's back is a fully unprivileged root hole.

- the module lacks a big fat warning emitted to dmesg, that this
turns the host kernel into a complete security disaster.

- the module fails to set the TAINT_CRAP flag when initialized.

This shows a pretty obvious discrepancy between intention and action.

I think there is a significant misunderstanding here. This line of research assumes the kernel is already compromised and behaving adversarially toward a more trusted application. Thus, the attack surface under scrutiny in these projects is between the enclave and the rest of the system. Not that we want kernels to be rooted, or make this easier, but exploits happen in practice.

The threat model for Graphene, and most SGX papers, is quite explicit: we assume that Intel's CPU package, the software in the enclave, and possibly Intel's Attestation Service (IAS) are the only trusted components. Any other software should be assumed compromised, and one can even assume memory is physically tampered with or that one has plugged in an adversarial device. It is not a question of the limitations of the kernel; the threat model assumes that the kernel is already rooted.

For the community these papers are typically written for, this assumption would be well understood. Thus it is common to see code artifacts that emulate or even undermine the security of untrusted components. Such artifacts are not appropriate for production use, but the typical audience would understand this risk. And, initially, when people started using Graphene, I checked who they were: almost exclusively SGX researchers who would have this context. Only recently has interest grown to the point that these warnings need to be revised for a more general audience. But the point that we should revise our readme and warnings for a more general audience is well taken.

Having proper in kernel FSGSBASE support is the only solution to that
problem and this has been true since the whole SGX frenzy started. Intel
failed to upstream FSGSBASE since 2015 (sic!). See

https://lore.kernel.org/lkml/alpine.DEB.2.21.1903261010380.1789@xxxxxxxxxxxxxxxxxxxxxxx/

for a detailed time line. And that mail is more than a year old.

Since then there happened even more trainwrecks including the revert of
already queued patches a few days before the 5.3 merge window opened.

After that we saw yet more broken variants of that patch set including
the failure to provide information which is required to re-merge that.

Instead of providing that information the next version re-introduced the
wreckage which was carefully sorted out during earlier review cycles up
to the revert.

So you (and everybody else who has interest in SGX) just sat there,
watched and hoped that this will solve itself magically. And with that
"hope" argument you really want to make me believe that all of this was
against your intentions?

It's beyond hilarious that the renewed attempt to get FSGSBASE support
merged does not come from the company which has the main interest to get
this solved, i.e. Intel.

Yes! I think we are in agreement that we expected Intel to upstream this support - it is their product. I don't see why I am personally responsible to come to the aid of a multi-billion dollar corporation in my free time, or that it is wrong to at least let them try first and see how far they get.

Until recently, we were doing proof-of-concept research, not product development, and there are limited hours in the day. I also hasten to say that the product of research is an article; the software artifact serves as documentation of the experiment. In contrast, the product of software development is software. It takes significant time and effort to convert one to the other. Upstreaming code is of little scientific interest. But things have changed for our project; we had no users in 2015, and we are now un-cutting corners that are appropriate for research but inappropriate for production. For a research artifact with an audience that knew the risks, we shipped a module because it was easier to maintain and install than a kernel patch.

Also, there is a chicken-and-egg problem here: AFAIU a kernel patch needs a userspace demonstration to motivate merging. We can't do a userspace demonstration without this feature. My main interest in showing up for this discussion was to try to make the case that, compared to 2015, there is a more convincing userspace demonstration and larger population of interested users.


Based on your argumentation that all of this is unintended, I assume that
the pull request on github which removes this security hole from
graphene:

https://github.com/oscarlab/graphene/pull/1529

is perfectly fine, right?

As for the patch and pull request, I personally think the right thing to do is add the warnings you suggest, help test this or another kernel patch, and advise users that patching their kernel is more secure than this module. I am not in favor of fully deleting the module, in the interest of transparency and reproducibility.


Looking at the advertising which all involved parties including the
Confidential Computing Consortium are conducting, plus the fact that
Intel has major investments in SGX supporting companies and projects,
this is one of the worst marketing scams I've seen in decades.

This all violates the fundamental engineering principle of "correctness
first" and I'm flabbergasted that academic research has degraded into
the "features first" advocating domain.

What's worse is that publicly funded research is failing to serve the
public interest and instead is acting as an advertising machine for their
corporate sponsors.

Finally, I must rebut the claim that my research abuses public funds to advertise for Intel. I have been working on this problem since before I knew SGX existed, and have been completely transparent regarding subsequent collaborations with Intel. I believe that understanding the pros and cons of different techniques to harden an application against a compromised kernel is in the public interest, and my research projects have been reviewed and overseen according to standard practices at both the university and US government funding agencies. The expectations of agencies in the US funding research are the paper, the insights, and proof-of-concept software; converting proof-of-concept software into production quality is generally considered a "nice to have".

- Don