Re: Linux guest kernel threat model for Confidential Computing

From: Jörg Rödel
Date: Thu Jan 26 2023 - 04:20:58 EST


On Wed, Jan 25, 2023 at 04:16:02PM +0100, Greg Kroah-Hartman wrote:
> Argument that it doesn't work? I thought that ship sailed a long time
> ago but I could be wrong as I don't really pay attention to that stuff
> as it's just vaporware :)

Well, "vaporware" is a bold word, especially given the fact that one can
get a confidential VM using AMD SEV[1] or SEV-SNP[2] in the cloud today.
Hardware for SEV-SNP has also been widely available since at least
October 2021.

But okay, there seems to be some misunderstanding about what
Confidential Computing (CoCo) implies, so let me state my view here.

The vision for CoCo is to remove trust from the hypervisor (HV), so that
a guest owner only needs to trust the hardware and the OS vendor for the
VM to be trusted and the data in it to be secure.

The implication is that the guest-HV interface becomes an attack surface
for the guest, and there are two basic strategies to mitigate the risk:

1) Move HV functionality into the guest or the hardware and
reduce the guest-HV interface. This already happened to some
degree with the SEV-ES enablement, where instruction decoding
and handling of most intercepts moved into the guest kernel.

2) Harden the guest-HV interface against malicious input.

Where possible we are going with option 1, up to the point where
scheduling our VCPUs is the only point we need to trust the HV on.

For example, the whole interrupt injection logic will also move either
into guest context or the hardware (depends on the HW vendor). That
covers most of the CPU emulation that the HV was doing, but an equally
important part is device emulation.

Device emulation is harder to move into the trusted guest context:
first, because there is limited hardware support for that, and second,
because it would not perform well.

So device emulation will have to stay in the HV for the foreseeable
future (except for devices carrying secrets, like the TPM). What Elena
and others are trying to do in this thread is to make the wider kernel
community aware that malicious input to a device driver is a real
problem in some environments and that driver hardening is actually
worthwhile.
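
To make "malicious input to a device driver" a bit more concrete, here is
a minimal user-space sketch of the kind of check driver hardening asks
for. It is not taken from any real driver: the descriptor layout, the
names (used_desc, handle_used_desc, RING_SIZE, BUF_SIZE) and the sizes
are all made up for illustration. The assumption is that the HV-emulated
device writes a "used" descriptor into memory shared with the guest, and
that the guest must not trust either field.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8u    /* hypothetical number of receive buffers */
#define BUF_SIZE  256u  /* hypothetical size of each receive buffer */

/* Hypothetical descriptor that the (untrusted) device writes to shared memory. */
struct used_desc {
	uint32_t id;   /* buffer index the device claims to have filled */
	uint32_t len;  /* byte count the device claims to have written */
};

/*
 * Copy the payload named by a device-written descriptor into 'out'.
 * Returns the validated length, or -1 if the descriptor is rejected.
 */
static int handle_used_desc(const volatile struct used_desc *shared,
			    uint8_t bufs[RING_SIZE][BUF_SIZE],
			    uint8_t *out, size_t out_size)
{
	/*
	 * Read each field exactly once into a local, so the other side
	 * cannot change the value between the check and the use.
	 */
	uint32_t id = shared->id;
	uint32_t len = shared->len;

	/* Never index guest memory with an unchecked device-supplied value. */
	if (id >= RING_SIZE)
		return -1;

	/* The device may claim more data than the buffer can hold. */
	if (len > BUF_SIZE || len > out_size)
		return -1;

	memcpy(out, bufs[id], len);
	return (int)len;
}
```

Without the two bounds checks, a hostile HV could pick id or len to make
the guest read outside its own buffers, which is exactly the class of bug
that does not matter much when the HV is trusted anyway.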

Regards,

Joerg


[1] https://cloud.google.com/confidential-computing
[2] https://learn.microsoft.com/en-us/azure/confidential-computing/confidential-vm-overview