RE: Linux guest kernel threat model for Confidential Computing

From: Reshetova, Elena
Date: Tue Jan 31 2023 - 05:08:31 EST


Hi Dinechin,

Thank you very much for your review! Please find the replies inline.

>
> Hi Elena,
>
> On 2023-01-25 at 12:28 UTC, "Reshetova, Elena" <elena.reshetova@xxxxxxxxx>
> wrote...
> > Hi Greg,
> >
> > You mentioned a couple of times (last time in this recent thread:
> > https://lore.kernel.org/all/Y80WtujnO7kfduAZ@xxxxxxxxx/) that we ought to
> > start discussing the updated threat model for the kernel, so this email is
> > a start in this direction.
> >
> > (Note: I tried to include relevant people from different companies, as
> > well as the linux-coco mailing list, but I hope everyone can help by
> > including additional people as needed).
> >
> > As we have shared before in various lkml threads/conference presentations
> > ([1], [2], [3] and many others), for the Confidential Computing guest
> > kernel, we have a change in the threat model where the guest kernel no
> > longer trusts the hypervisor. This is a big change in the threat model and
> > requires both careful assessment of the new (hypervisor <-> guest kernel)
> > attack surface, as well as careful design of mitigations and security
> > validation techniques. This is the activity that we have started back at
> > Intel and the current status can be found in
> >
> > 1) Threat model and potential mitigations:
> > https://intel.github.io/ccc-linux-guest-hardening-docs/security-spec.html
>
> I only looked at this one so far. Here are a few quick notes:
>
> DoS attacks are out of scope. What about timing attacks, which were the
> basis of some of the most successful attacks in the past years? My
> understanding is that TDX relies on existing mitigations, and does not
> introduce anything new in that space. Worth mentioning in that "out of
> scope" section IMO.

Timing attacks are not out of scope, because TD guest software has to consider
them and protect itself adequately. We have a section further down,
"Transient Execution attacks mitigation":
https://intel.github.io/ccc-linux-guest-hardening-docs/security-spec.html#transient-execution-attacks-and-their-mitigation
but I agree it is worth pointing to this (and to generic side-channel attacks)
already in the scoping section. I will make an update.

>
> Why are TDVMCALL hypercalls listed as an "existing" communication interface?
> That seems to exclude the TDX module from the TCB.

I believe this is just ambiguous wording; I need to find a better one.
TDVMCALL is indeed a *new* TDX-specific communication interface, but in this
case it is only a transport for the actual *existing* legacy communication
interfaces between the VM guest and the host/hypervisor (MSR reads/writes,
PCI config space access, port IO, MMIO, etc.).

> Also, "shared memory for
> I/Os" seems unnecessarily restrictive, since it excludes interrupts, timing
> attacks, network or storage attacks, or devices passed through to the guest.
> The latter category seems important to list, since there are separate
> efforts to provide confidential computing capabilities e.g. to PCI devices,
> which were discussed elsewhere in this thread.

The second bullet was meant to say that there is another interface through
which the CoCo guest and the host/VMM can communicate: shared pages (as opposed
to private pages, which are only accessible to the confidential computing
guest). Maybe I should drop the "IO" part to avoid confusion. The other means
you list, such as interrupts and disk (some are higher-level abstractions, like
disk operations that happen over a bounce buffer in shared memory), are covered
below in separate sections of the doc, with the exception of CoCo-enabled
devices. This is something we can briefly mention as an addition, but since we
don’t have these devices yet, nor a Linux implementation that can securely add
them to the CoCo guest, I find it premature to discuss details at this point.
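
To make the shared vs. private page distinction more concrete, here is a
minimal user-space sketch of the pattern a guest would typically follow when
consuming data from a shared page: fetch each host-controlled field once,
validate it, and only then copy into private memory. The struct layout, the
size limit and the function name are hypothetical and chosen for illustration
only; they are not taken from the kernel or from the TDX specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHARED_PAYLOAD_MAX 64

/* Hypothetical layout of a shared page. The host/VMM can rewrite any
 * field at any time, so nothing in it can be trusted. */
struct shared_msg {
    uint32_t len;
    uint8_t payload[SHARED_PAYLOAD_MAX];
};

/*
 * Copy a message from shared memory into a private buffer.
 * Returns the number of payload bytes copied, or -1 if the
 * host-supplied length is rejected.
 */
int copy_from_shared(const volatile struct shared_msg *shared,
                     uint8_t *priv, size_t priv_size)
{
    uint32_t len;
    size_t i;

    assert(shared != NULL && priv != NULL);

    /* Read the length exactly once into a private variable, so the
     * value that is checked is the value that is used. */
    len = shared->len;

    if (len > SHARED_PAYLOAD_MAX || len > priv_size)
        return -1;

    /* Copy byte by byte from the shared page into private memory. */
    for (i = 0; i < len; i++)
        priv[i] = shared->payload[i];

    return (int)len;
}
```

The point of the local `len` is that checking `shared->len` and then reading
it again for the copy would let the host change the value between the two
reads.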


> I suspect that my question above is due to ambiguous wording. What I
> initially read as "this is out of scope for TDX" morphs in the next
> paragraph into "we are going to explain how to mitigate attacks through
> TDVMCALLS and shared memory for I/O". Consider rewording to clarify the
> intent of these paragraphs.
>

Sure, sorry for the ambiguous wording; I will try to clarify.

> Nit: I suggest adding bullets to the items below "between host/VMM and the
> guest"

Yes, it actually used to have them; I have to see what happened in the recent docs update.

>
> You could count the "unique code locations" that can consume malicious input
> in drivers, why not in core kernel? I think you write elsewhere that the
> drivers account for the vast majority, so I suspect you have the numbers.

I don’t have ready numbers for the core kernel, but I can calculate them if really needed.
The public files that would produce this data can be found here:
https://github.com/intel/ccc-linux-guest-hardening/tree/master/bkc/audit/sample_output/6.0-rc2

https://github.com/intel/ccc-linux-guest-hardening/blob/master/bkc/audit/sample_output/6.0-rc2/smatch_warns_6.0_tdx_allyesconfig
is all hits (with taint propagation) for the whole allyesconfig (x86 build, CONFIG_COMPILE_TEST is off).
https://github.com/intel/ccc-linux-guest-hardening/blob/master/bkc/audit/sample_output/6.0-rc2/smatch_warns_6.0_tdx_allyesconfig_filtered
is the same but with most of the drivers dropped.


>
> "The implementation of the #VE handler is simple and does not require an
> in-depth security audit or fuzzing since it is not the actual consumer of
> the host/VMM supplied untrusted data": The assumption there seems to be that
> the host will never be able to supply data (e.g. through a bounce buffer)
> that it can trick the guest into executing. If that is indeed the
> assumption, it is worth mentioning explicitly. I suspect it is a bit weak,
> since many earlier attacks were based on executing the wrong code. Notably,
> it is worth pointing out that I/O buffers are _not_ encrypted with the CPU
> key (as opposed to any device key e.g. for PCI encryption) in either
> TDX or SEV. Is there for example anything that precludes TDX or SEV from
> executing code in the bounce buffers?

Kirill has already replied to this: any code execution out of shared memory
generates a #GP.

>
> "We only care about users that read from MMIO": Why? My guess is that this
> is the only way bad data could be fed to the guest. But what if a bad MMIO
> write due to poisoned data injected earlier was a necessary step to open the
> door to a successful attack?

The entry point of the attack is still a "read". The situation you describe can happen,
but the root cause would still be an incorrectly handled MMIO read, and this is what
we try to check by both fuzzing and auditing the 'read' entry points.
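
As a rough illustration of why the read is the entry point, consider the
user-space sketch below. A host-controlled value obtained through an MMIO read
flows into a later MMIO write; if the read is validated, the write cannot be
driven by poisoned data. All names here (`fake_dev_regs`, `fake_mmio_read`,
`ring_doorbell`, the register layout) are hypothetical stand-ins, not real
kernel or driver APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_QUEUES 4

/* Hypothetical device register block. In a CoCo guest the host/VMM
 * controls every value the guest reads from it. */
struct fake_dev_regs {
    uint32_t active_queue;  /* host-controlled index */
    uint32_t doorbell;      /* written by the guest */
};

/* Stand-ins for MMIO accessors; illustrative only. */
static uint32_t fake_mmio_read(const volatile uint32_t *reg)
{
    return *reg;
}

static void fake_mmio_write(volatile uint32_t *reg, uint32_t val)
{
    *reg = val;
}

/* Guest-private table indexed by the value the device reports. */
static const uint32_t queue_tokens[NUM_QUEUES] = { 10, 11, 12, 13 };

/* Returns 0 on success, -1 if the host-supplied index is rejected. */
int ring_doorbell(struct fake_dev_regs *regs)
{
    uint32_t idx;

    assert(regs != NULL);

    /* The read is where untrusted data enters the guest. */
    idx = fake_mmio_read(&regs->active_queue);

    /* Validate at the read site, before the value reaches any
     * array access or any subsequent MMIO write. */
    if (idx >= NUM_QUEUES)
        return -1;

    fake_mmio_write(&regs->doorbell, queue_tokens[idx]);
    return 0;
}
```

Without the bounds check, the write to `doorbell` would carry data from an
out-of-bounds access, but the defect would still be located at the unchecked
read.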

Thank you again for the review!

Best Regards,
Elena.