RE: [PATCH v2 2/2] x86/tsx: Disable TSX development mode at boot

From: Krishnan, Neelima
Date: Wed Apr 06 2022 - 16:54:04 EST




-----Original Message-----
From: Pawan Gupta <pawan.kumar.gupta@xxxxxxxxxxxxxxx>
Sent: Tuesday, March 29, 2022 10:28 PM
To: Borislav Petkov <bp@xxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Ingo Molnar <mingo@xxxxxxxxxx>; Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>; x86@xxxxxxxxxx; H. Peter Anvin <hpa@xxxxxxxxx>; Andi Kleen <ak@xxxxxxxxxxxxxxx>; Luck, Tony <tony.luck@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; antonio.gomez.iglesias@xxxxxxxxxxxxxxx; Krishnan, Neelima <neelima.krishnan@xxxxxxxxx>; stable@xxxxxxxxxxxxxxx; Cooper, Andrew <andrew.cooper3@xxxxxxxxxx>; Poimboe, Josh <jpoimboe@xxxxxxxxxx>
Subject: Re: [PATCH v2 2/2] x86/tsx: Disable TSX development mode at boot

On Tue, Mar 29, 2022 at 06:24:03PM +0200, Borislav Petkov wrote:
>On Thu, Mar 10, 2022 at 02:02:09PM -0800, Pawan Gupta wrote:
>> A microcode update on some Intel processors causes all TSX
>> transactions to always abort by default [*]. Microcode also added
>> functionality to re-enable TSX for development purpose. With this
>> microcode loaded, if tsx=on was passed on the cmdline, and TSX
>> development mode was already enabled before the kernel boot, it may
>> make the system vulnerable to TSX Asynchronous Abort (TAA).
>>
>> To be on safer side, unconditionally disable TSX development mode at
>> boot. If needed, a user can enable it using msr-tools.
>>
>> [*] Intel Transactional Synchronization Extension (Intel TSX) Disable Update for Selected Processors
>> https://cdrdv2.intel.com/v1/dl/getContent/643557
>>
>> Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> Suggested-by: Borislav Petkov <bp@xxxxxxxxx>
>> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@xxxxxxxxxxxxxxx>
>> Cc: <stable@xxxxxxxxxxxxxxx>
>> ---
>> arch/x86/include/asm/msr-index.h | 4 +--
>> arch/x86/kernel/cpu/cpu.h | 1 +
>> arch/x86/kernel/cpu/intel.c | 4 +++
>> arch/x86/kernel/cpu/tsx.c | 34 ++++++++++++++++++++++++++
>> tools/arch/x86/include/asm/msr-index.h | 4 +--
>> 5 files changed, 43 insertions(+), 4 deletions(-)
>
>Does this a lot more encapsulated version work too?

>Neelima is testing this patch, she will share the results tomorrow.

Following up on this thread: I did some basic functional validation of the patch [1].
Initially I ran into the bug where the mitigation was disabled on one CPU after a suspend/resume [2].
After applying the patch [1] on latest upstream, together with the fix for restoring speculation-related MSRs during S3 resume [2], my tests pass.

Quick summary of the testcases executed:
Testcase 1: Verify RTM_ALLOW is reset after a kexec reboot
Testcase 2: Verify TSX_CTRL_MSR is restored after the system resumes from S3 suspend
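
For anyone who wants to repeat checks like these, below is a rough sketch of how the per-CPU MSR state could be inspected from userspace. It is not the script used for the results above. It assumes the msr driver is loaded (modprobe msr), that it is run as root, and that the MSR addresses and the RTM_ALLOW bit position match what I recall from arch/x86/include/asm/msr-index.h with the patch applied; please verify them against your tree. The helper names are made up for illustration.

```python
import glob
import os
import struct

# Assumed values, recalled from msr-index.h with the patch applied; verify locally.
MSR_IA32_TSX_CTRL = 0x122      # TSX control MSR
MSR_IA32_MCU_OPT_CTRL = 0x123  # MSR that carries the RTM_ALLOW bit
RTM_ALLOW = 1 << 1             # TSX development mode enable bit


def read_msr(cpu, msr):
    """Read one 64-bit MSR on one CPU through the msr driver (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The MSR address is used as the file offset; a read returns 8 bytes.
        return struct.unpack("<Q", os.pread(fd, 8, msr))[0]
    finally:
        os.close(fd)


def msr_cpus():
    """CPU numbers that currently expose a /dev/cpu/N/msr node."""
    return sorted(int(p.split("/")[3]) for p in glob.glob("/dev/cpu/[0-9]*/msr"))


def snapshot(msr):
    """Map each CPU number to its current value of the given MSR."""
    return {cpu: read_msr(cpu, msr) for cpu in msr_cpus()}


def cpus_with_bit_set(values, mask):
    """CPUs whose MSR value has any bit of mask set."""
    return [cpu for cpu, value in sorted(values.items()) if value & mask]


def cpus_differing(before, after):
    """CPUs whose MSR value changed (or disappeared) between two snapshots."""
    return [cpu for cpu in sorted(before) if after.get(cpu) != before[cpu]]
```

For the kexec case, cpus_with_bit_set(snapshot(MSR_IA32_MCU_OPT_CTRL), RTM_ALLOW) should come back empty after the new kernel boots. For the S3 case, take snapshot(MSR_IA32_TSX_CTRL) before suspend and again after resume, and cpus_differing() on the two should be empty. On parts that do not implement one of these MSRs the read fails with an OSError, so the sketch only applies to affected processors.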

[1] https://lore.kernel.org/lkml/YkMyo2Jw8iYx9wAU@xxxxxxx/
[2] https://github.com/torvalds/linux/commit/e2a1256b17b16f9b9adf1b6fea56819e7b68e463

Thanks
Neelima