Re: Linux guest kernel threat model for Confidential Computing

From: Christophe de Dinechin
Date: Mon Jan 30 2023 - 06:43:43 EST


Hi Elena,

On 2023-01-25 at 12:28 UTC, "Reshetova, Elena" <elena.reshetova@xxxxxxxxx> wrote...
> Hi Greg,
>
> You mentioned couple of times (last time in this recent thread:
> https://lore.kernel.org/all/Y80WtujnO7kfduAZ@xxxxxxxxx/) that we ought to start
> discussing the updated threat model for kernel, so this email is a start in this direction.
>
> (Note: I tried to include relevant people from different companies, as well as linux-coco
> mailing list, but I hope everyone can help by including additional people as needed).
>
> As we have shared before in various lkml threads/conference presentations
> ([1], [2], [3] and many others), for the Confidential Computing guest kernel, we have a
> change in the threat model where guest kernel doesn’t anymore trust the hypervisor.
> This is a big change in the threat model and requires both careful assessment of the
> new (hypervisor <-> guest kernel) attack surface, as well as careful design of mitigations
> and security validation techniques. This is the activity that we have started back at Intel
> and the current status can be found in
>
> 1) Threat model and potential mitigations:
> https://intel.github.io/ccc-linux-guest-hardening-docs/security-spec.html

I only looked at this one so far. Here are a few quick notes:

DoS attacks are out of scope. What about timing attacks, which were the
basis of some of the most successful attacks in recent years? My
understanding is that TDX relies on existing mitigations and does not
introduce anything new in that space. This is worth mentioning in that
"out of scope" section IMO.

Why are TDVMCALL hypercalls listed as an "existing" communication interface?
That seems to exclude the TDX module from the TCB. Also, "shared memory for
I/Os" seems unnecessarily restrictive, since it excludes interrupts, timing
attacks, network or storage attacks, or devices passed through to the guest.
The latter category seems important to list, since there are separate
efforts to provide confidential computing capabilities e.g. to PCI devices,
which were discussed elsewhere in this thread.

I suspect that my question above is due to ambiguous wording. What I
initially read as "this is out of scope for TDX" morphs in the next
paragraph into "we are going to explain how to mitigate attacks through
TDVMCALLs and shared memory for I/O". Consider rewording to clarify the
intent of these paragraphs.

Nit: I suggest adding bullets to the items below "between host/VMM and the
guest".

You could count the "unique code locations" that can consume malicious input
in drivers, so why not in the core kernel? I think you write elsewhere that
drivers account for the vast majority, so I suspect you have the numbers.

"The implementation of the #VE handler is simple and does not require an
in-depth security audit or fuzzing since it is not the actual consumer of
the host/VMM supplied untrusted data": The assumption there seems to be that
the host will never be able to supply data (e.g. through a bounce buffer)
that it can trick the guest into executing. If that is indeed the
assumption, it is worth mentioning explicitly. I suspect it is a bit weak,
since many earlier attacks were based on executing the wrong code. Notably,
it is worth pointing out that I/O buffers are _not_ encrypted with the CPU
key (as opposed to any device key, e.g. for PCI encryption) in either
TDX or SEV. Is there, for example, anything that precludes a TDX or SEV
guest from executing code in the bounce buffers?

"We only care about users that read from MMIO": Why? My guess is that this
is the only way bad data could be fed to the guest. But what if a bad MMIO
write due to poisoned data injected earlier was a necessary step to open the
door to a successful attack?
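
To make the concern concrete, here is a minimal user-space sketch of the
pattern I have in mind. It is not taken from the kernel or from the spec:
drv_state, untrusted_mmio_read(), cache_queue_idx() and ring_doorbell() are
all hypothetical names, and the "MMIO" accesses are simulated with a plain
array. The poisoned value enters through a read, but the damage would only
happen later, at the write site.

```c
#include <stdint.h>

#define NUM_QUEUES 4u

/* Hypothetical driver state; every name here is illustrative only. */
struct drv_state {
	uint32_t queue_idx;            /* cached from a host-controlled read */
	uint32_t doorbell[NUM_QUEUES]; /* stands in for an MMIO doorbell array */
};

/* Read side: the host/VMM fully controls the returned value. */
static uint32_t untrusted_mmio_read(uint32_t host_value)
{
	return host_value;
}

/* The driver caches the value; nothing bad happens at this point. */
static void cache_queue_idx(struct drv_state *s, uint32_t host_value)
{
	s->queue_idx = untrusted_mmio_read(host_value);
}

/*
 * Write side: the cached index selects the register to write.
 * Returns 0 on success, -1 if the index is out of range.
 * Without this check, the write would land outside doorbell[].
 */
static int ring_doorbell(struct drv_state *s, uint32_t val)
{
	if (s->queue_idx >= NUM_QUEUES)
		return -1;
	s->doorbell[s->queue_idx] = val;
	return 0;
}
```

With a cached index of 2, ring_doorbell() succeeds; with a cached index of
NUM_QUEUES, it returns -1. An audit that stops at the readers would need to
follow the value all the way to this write to see why the check matters.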


>
> 2) One of the described in the above doc mitigations is "hardening of the enabled
> code". What we mean by this, as well as techniques that are being used are
> described in this document:
> https://intel.github.io/ccc-linux-guest-hardening-docs/tdx-guest-hardening.html
>
> 3) All the tools are open-source and everyone can start using them right away even
> without any special HW (readme has description of what is needed).
> Tools and documentation is here:
> https://github.com/intel/ccc-linux-guest-hardening
>
> 4) all not yet upstreamed linux patches (that we are slowly submitting) can be found
> here: https://github.com/intel/tdx/commits/guest-next
>
> So, my main question before we start to argue about the threat model, mitigations, etc,
> is what is the good way to get this reviewed to make sure everyone is aligned?
> There are a lot of angles and details, so what is the most efficient method?
> Should I split the threat model from https://intel.github.io/ccc-linux-guest-hardening-docs/security-spec.html
> into logical pieces and start submitting it to mailing list for discussion one by one?
> Any other methods?
>
> The original plan we had in mind is to start discussing the relevant pieces when submitting the code,
> i.e. when submitting the device filter patches, we will include problem statement, threat model link,
> data, alternatives considered, etc.
>
> Best Regards,
> Elena.
>
> [1] https://lore.kernel.org/all/20210804174322.2898409-1-sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx/
> [2] https://lpc.events/event/16/contributions/1328/
> [3] https://events.linuxfoundation.org/archive/2022/linux-security-summit-north-america/program/schedule/


--
Cheers,
Christophe de Dinechin (https://c3d.github.io)
Theory of Incomplete Measurements (https://c3d.github.io/TIM)