RE: Linux guest kernel threat model for Confidential Computing

From: Reshetova, Elena
Date: Wed Feb 08 2023 - 05:44:43 EST




> On Tue, Feb 07, 2023 at 08:51:56PM -0500, Theodore Ts'o wrote:
> > Why not just simply compile a special CoCo kernel that doesn't have
> > any drivers that you don't trust.

Aside from the complexity and scalability of managing such a config, which has
to change with every kernel release, what about the built-in platform drivers?
I am not a driver expert, but as far as I understand they cannot be disabled
via config. Please correct me if this statement is wrong.

> In order to make $$$$$, you need to push the costs onto various
> different players in the ecosystem. This is cleverly disguised as
> taking current perfectly acceptable design paradigm when the trust
> boundary is in the traditional location, and causing all of the
> assumptions which you have broken as "bugs" that must be fixed by
> upstream developers.

The CC threat model does change the traditional Linux trust boundary regardless of
what mitigations are used (kernel config vs. runtime filtering). For the drivers
that a CoCo guest happens to need, there is no way to fix this problem by either
of these mechanisms (we cannot disable the code that we need), unless somebody
writes a totally new set of CoCo-specific drivers (who needs another set of
CoCo-specific virtio drivers in the kernel?).

So, if the path is to keep using the existing kernel driver code, then we need:

1. The small set of drivers that a CoCo guest requires needs to be hardened
(or whatever word people prefer to use here). This only means that, in the
presence of a malicious host/hypervisor that can manipulate PCI config space,
port IO and MMIO, these drivers should not compromise CC guest memory
confidentiality or integrity (including via privilege escalation into the CC
guest). Please note that this only applies to a small set of drivers (in the TDX
virtio setup we have fewer than 10 of them) and does not require invasive changes
to the kernel code. There is also some core PCI/MSI code involved in the discovery
and configuration of these drivers; this code also falls into the category we need
to make robust.

2. The rest of the drivers, which are not needed, must be disabled. Here we can
argue about what the correct method of doing this is and who should bear the costs
of enforcing it. But from a pure security point of view, the method that is simple
and clear and requires as little maintenance as possible usually has the best
chance of enforcing security.
And given that we already have the concept of authorized devices in Linux,
does this method really bring so much additional complexity to the kernel?
But it is hard to argue here without the code: we need to submit the filter
proposal first (it is still under internal review).
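
To illustrate what "hardened" means in point 1, here is a minimal, hypothetical
sketch in userspace C. It is not taken from any real driver or from the filter
proposal; the names (host_visible, guest_ring, ring_consume) and the sizes are
made up for illustration. The only idea it shows is that every value the host
can change is read once into a local copy and range-checked before it is used
as an index or a length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8u   /* hypothetical number of slots, chosen by the guest */
#define BUF_SIZE  64u  /* hypothetical per-slot buffer size */

/* Stand-in for MMIO/shared memory: the host can change these at any time. */
struct host_visible {
	volatile uint32_t used_idx;  /* slot the host claims to have filled */
	volatile uint32_t used_len;  /* bytes the host claims to have written */
};

/* Guest-private driver state. */
struct guest_ring {
	uint8_t bufs[RING_SIZE][BUF_SIZE];
};

/*
 * Copy one completed buffer into dst.
 * Returns the number of bytes copied, or -1 if a host-supplied value is
 * out of range.
 */
static int ring_consume(const struct host_visible *hv,
			const struct guest_ring *ring,
			uint8_t *dst, size_t dst_len)
{
	/*
	 * Read each host-controlled value exactly once and validate the
	 * local copy, so the host cannot change it between check and use.
	 */
	uint32_t idx = hv->used_idx;
	uint32_t len = hv->used_len;

	if (idx >= RING_SIZE)
		return -1;  /* would index outside the ring */
	if (len > BUF_SIZE || len > dst_len)
		return -1;  /* would read or write out of bounds */

	memcpy(dst, ring->bufs[idx], len);
	return (int)len;
}
```

Without the two checks, a malicious host could pick used_idx or used_len so
that the memcpy() reads or overwrites unrelated guest memory, which is the
kind of confidentiality/integrity break described above.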

Best Regards,
Elena.