Re: [PATCH v2] x86/cpuinfo: Clear X86_FEATURE_TME if TME/MKTME is disabled by BIOS
From: Kai Huang
Date: Mon Jul 11 2022 - 20:12:55 EST
On Mon, 2022-07-11 at 17:08 +0000, Sean Christopherson wrote:
> On Tue, Jul 05, 2022, Martin Fernandez wrote:
> > On 7/5/22, Kai Huang <kai.huang@xxxxxxxxx> wrote:
> > > On Mon, 2022-07-04 at 11:22 -0300, Martin Fernandez wrote:
> > > > Changelog since v1
> > > >
> > > > Clear the flag not only for BSP but for every cpu in the system.
>
> ...
>
> > > > ---
> > > > arch/x86/kernel/cpu/intel.c | 1 +
> > > > 1 file changed, 1 insertion(+)
> > > >
> > > > diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> > > > index fd5dead8371c..17f23e23f911 100644
> > > > --- a/arch/x86/kernel/cpu/intel.c
> > > > +++ b/arch/x86/kernel/cpu/intel.c
> > > > @@ -570,6 +570,7 @@ static void detect_tme(struct cpuinfo_x86 *c)
> > > >
> > > > if (!TME_ACTIVATE_LOCKED(tme_activate) ||
> > > > !TME_ACTIVATE_ENABLED(tme_activate)) {
> > > > pr_info_once("x86/tme: not enabled by BIOS\n");
> > > > + clear_cpu_cap(c, X86_FEATURE_TME);
>
> This misses the case where the TME_ACTIVATE_KEYID_BITS() is zero. AFAICT, that's
> allowed, i.e. won't #GP on WRMSR. TME_ACTIVATE_KEYID_BITS() can't be non-zero if
> TME_ACTIVATE_ENABLED() is false, but the reverse is allowed.
But this logic applies to "whether MKTME is enabled", but not "TME is enabled",
right?
If either LOCKED or ENABLED bit isn't set, then TME is not enabled. My
understanding is this particular patch is trying to fix the issue that TME flag
isn't cleared when TME is disabled by BIOS.
>
> > > > mktme_status = MKTME_DISABLED;
> > > > return;
> > >
> > > This code change itself looks good to me.
> > >
> > > But, TME actually supports bypassing TME encryption/decryption by setting 1
> > > to bit 31 to IA32_TME_ACTIVATE MSR. See 'Table 4-2 IA32_TME_ACTIVATE MSR'
> > > in MKTME spec below:
> > >
> > > https://edc.intel.com/content/www/us/en/design/ipla/software-development-platforms/client/platforms/alder-lake-desktop/12th-generation-intel-core-processors-datasheet-volume-1-of-2/002/intel-multi-key-total-memory-encryption/
> > >
> > > When bit 31 is set, the TME is bypassed (no encryption/decryption for KeyID 0).
> > >
> > > So looks userspace also needs to check this if it wants to truly check
> > > whether "TME memory encryption" is active.
> > >
> > > But perhaps it's arguable whether we can also clear TME flag in this case.
> >
> > Yep, that's what I thought.
>
> IMO, this entire function needs to be reworked to have a cohesive strategy for
> enumerting TME; not just enumerating to userspace, but internal to the kernel as
> well.
>
> E.g. forcing "mktme_status = MKTME_DISABLED" on an AP is nonsensical. If an AP's
> basic MKTME enabling doesn't align with the BSP (activate, algorithm, and keyid0
> bypass settings match), then there's no way an AP is going to reach detect_tme().
> Any discrepancy in encryption for keyid0 will cause the AP will read garbage on
> wakeup, and barring a miracle, will triple fault and never call in.
>
> Conversely, if basic enabling matches but something else mismatches, e.g. an AP
> was configured with fewer keys, then forcing "mktme_status = MKTME_DISABLED" may
> be misleading as MKTME may be fully enabled and in use for keyid0, it just won't
> be used for keyid!=0. But that's a moot point because as is, the kernel _never_
> uses keyid!=0.
>
> And this code is also bogus. Just because the kernel doesn't know the encryption
> algorithm doesn't magically turn off encryption for keyid0. Again, mktme_status
> confuses "memory is encrypted" with "MKTME is theoretically usable for keyid!=0".
>
> tme_crypto_algs = TME_ACTIVATE_CRYPTO_ALGS(tme_activate);
> if (!(tme_crypto_algs & TME_ACTIVATE_CRYPTO_AES_XTS_128)) {
> pr_err("x86/mktme: No known encryption algorithm is supported: %#llx\n",
> tme_crypto_algs);
> mktme_status = MKTME_DISABLED;
> }
>
> The mktme_status variable seems completely pointless. It's not used anywhere
> except to detect that CPU0 vs. APs.
I think your above saying makes sense, but this is a different topic and should
be in a separate patch IMHO.
This patch basically tries to fix the issue that TME flag isn't cleared when TME
is disabled by BIOS. And fir this purpose, the code change in this patch looks
reasonable to me. Unless I am mistaken, detect_tme() will be called for all
cpus if TME is supported in CPUID but isn't enabled by BIOS (either LOCKED or
ENABLED bit isn't set).
>
>
> Something like this seems like a sane approach.
>
> ---
>
> #define MSR_IA32_TME_ACTIVATE 0x982
>
> /* Helpers to access TME_ACTIVATE MSR */
> #define TME_ACTIVATE_LOCKED(x) (x & 0x1)
> #define TME_ACTIVATE_ENABLED(x) (x & 0x2)
>
> #define TME_ACTIVATE_KEYID0_BYPASS(x) (x & BIT(31))
>
> #define TME_ACTIVATE_POLICY(x) ((x >> 4) & 0xf) /* Bits 7:4 */
> #define TME_ACTIVATE_POLICY_AES_XTS_128 0
>
> #define TME_ACTIVATE_KEYID_BITS(x) ((x >> 32) & 0xf) /* Bits 35:32 */
>
> #define TME_ACTIVATE_CRYPTO_ALGS(x) ((x >> 48) & 0xffff) /* Bits 63:48 */
> #define TME_ACTIVATE_CRYPTO_AES_XTS_128 1
>
> static int tme_keyid_bits_cpu0 = -1;
> static u64 tme_activate_cpu0;
>
> static void detect_tme(struct cpuinfo_x86 *c)
> {
> u64 tme_activate, tme_policy, tme_crypto_algs;
>
> rdmsrl(MSR_IA32_TME_ACTIVATE, tme_activate);
>
> if (tme_keyid_bits_cpu0 >= 0) {
> /* Broken BIOS? */
> if (tme_activate != tme_activate_cpu0)
> pr_err_once("x86/tme: configuration is inconsistent between CPUs\n");
>
> /*
> * Proceed, stolen keyid bits still needed to be excluded from
> * x86_phys_bits. The divergence is all but guaranteed to be
> * benign, else this CPU would have died during bringup.
> */
> goto adjust_phys_bits;
> }
>
> tme_activate_cpu0 = tme_activate;
>
> if (!TME_ACTIVATE_LOCKED(tme_activate) ||
> !TME_ACTIVATE_ENABLED(tme_activate))
> tme_keyid_bits_cpu0 = 0;
> else
> tme_keyid_bits_cpu0 = TME_ACTIVATE_KEYID_BITS(tme_activate);
>
> if (!tme_keyid_bits_cpu0) {
> pr_info("x86/tme: not enabled by BIOS\n");
> setup_clear_cpu_cap(X86_FEATURE_TME);
> return;
> }
>
> pr_info("x86/tme: enabled by BIOS\n");
>
> if (TME_ACTIVATE_KEYID0_BYPASS(tme_activate)) {
> pr_info("x86/tme: KeyID=0 encryption bypass enabled\n");
>
> /*
> * Clear the feature flag, memory for keyid0 isn't encrypted so
> * for all intents and purposes MKTME is unused.
> */
> setup_clear_cpu_cap(X86_FEATURE_TME);
> goto adjust_phys_bits;
> }
>
> tme_policy = TME_ACTIVATE_POLICY(tme_activate);
> if (tme_policy != TME_ACTIVATE_POLICY_AES_XTS_128)
> pr_warn("x86/tme: Unknown policy is active: %#llx\n", tme_policy);
>
> tme_crypto_algs = TME_ACTIVATE_CRYPTO_ALGS(tme_activate);
> if (!(tme_crypto_algs & TME_ACTIVATE_CRYPTO_AES_XTS_128))
> pr_warn("x86/mktme: Unknown encryption algorithm is active: %#llx\n",
> tme_crypto_algs);
>
> adjust_phys_bits:
> /*
> * KeyID bits effectively lower the number of physical address bits.
> * Update cpuinfo_x86::x86_phys_bits accordingly. Always use CPU0's
> * info for the adjustment. If CPU0 steals more bits, then aligning
> * with CPU0 gives the highest chance of survival. If CPU0 steals
> * fewer bits, adjusting this CPU's x86_phys_bits won't retroactively
> * fix all the calculations done using CPU0's information
> */
> c->x86_phys_bits -= tme_keyid_bits_cpu0;
> }