Re: [PATCHv5 23/30] x86/boot: Avoid #VE during boot for TDX platforms

From: Xiaoyao Li
Date: Mon Mar 07 2022 - 20:19:21 EST


On 3/8/2022 6:33 AM, Kirill A. Shutemov wrote:
On Mon, Mar 07, 2022 at 05:29:27PM +0800, Xiaoyao Li wrote:
...
Even though CPUID reports that MCE is supported, every access to the
MCE-related MSRs causes a #VE. If they are accessed via mce_rdmsrl(), the #VE
is fixed up and ends up in ex_handler_msr_mce(), which finally leads to panic().

It is not a panic, but a warning, like this:

unchecked MSR access error: RDMSR from 0x179 at rIP: 0xffffffff810df1e9 (__mcheck_cpu_cap_init+0x9/0x130)
Call Trace:
<TASK>
mcheck_cpu_init+0x3d/0x2c0
identify_cpu+0x85a/0x910
identify_boot_cpu+0xc/0x98
check_bugs+0x6/0xa7
start_kernel+0x363/0x3d1
secondary_startup_64_no_verify+0xe5/0xeb
</TASK>

It is annoying, but not fatal. The patchset is big enough as it is.
I tried to keep the number of patches under control.


I did hit a panic, as shown below.

[ 0.578792] mce: MSR access error: RDMSR from 0x475 at rIP: 0xffffffffb94daa92 (mce_rdmsrl+0x22/0x60)
[ 0.578792] Call Trace:
[ 0.578792] <TASK>
[ 0.578792] machine_check_poll+0xf0/0x260
[ 0.578792] __mcheck_cpu_init_generic+0x3d/0xb0
[ 0.578792] mcheck_cpu_init+0x16b/0x4a0
[ 0.578792] identify_cpu+0x467/0x5c0
[ 0.578792] identify_boot_cpu+0x10/0x9a
[ 0.578792] check_bugs+0x2a/0xa06
[ 0.578792] start_kernel+0x6bc/0x6f1
[ 0.578792] x86_64_start_reservations+0x24/0x26
[ 0.578792] x86_64_start_kernel+0xad/0xb2
[ 0.578792] secondary_startup_64_no_verify+0xe4/0xeb
[ 0.578792] </TASK>
[ 0.578792] Kernel panic - not syncing: MCA architectural violation!
[ 0.578792] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc5-td-guest-upstream+ #2
[ 0.578792] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 0.578792] Call Trace:
[ 0.578792] <TASK>
[ 0.578792] dump_stack_lvl+0x49/0x5f
[ 0.578792] dump_stack+0x10/0x12
[ 0.578792] panic+0xf9/0x2d0
[ 0.578792] ex_handler_msr_mce+0x5e/0x5e
[ 0.578792] fixup_exception+0x2f4/0x310
[ 0.578792] exc_virtualization_exception+0x9b/0x100
[ 0.578792] asm_exc_virtualization_exception+0x12/0x40
[ 0.578792] RIP: 0010:mce_rdmsrl+0x22/0x60
[ 0.578792] Code: a0 b9 e8 75 4d fb ff 90 55 48 89 e5 41 54 53 89 fb 48 c7 c7 9c c1 f6 b9 e8 4b 28 00 00 65 8a 05 97 52 b4 46 84 c0 75 10 89 d9 <0f> 32 48 c1 e2 20 48 09 d0 5b 41 5c 5d c3 89 df e8 c9 5a 17 ff 4c
[ 0.578792] RSP: 0000:ffffffffba203cd8 EFLAGS: 00010246
[ 0.578792] RAX: 0000000000000000 RBX: 0000000000000475 RCX: 0000000000000475
[ 0.578792] RDX: 00000000000001d0 RSI: ffffffffb9f6c19c RDI: ffffffffb9ece016
[ 0.578792] RBP: ffffffffba203ce8 R08: ffffffffba203cb0 R09: ffffffffba203cb4
[ 0.578792] R10: 0000000000000000 R11: 000000000000000f R12: 0000000000000001
[ 0.578792] R13: ffffffffba203dc0 R14: 000000000000000a R15: 000000000000001d
[ 0.578792] ? mce_rdmsrl+0x15/0x60
[ 0.578792] machine_check_poll+0xf0/0x260
[ 0.578792] __mcheck_cpu_init_generic+0x3d/0xb0
[ 0.578792] mcheck_cpu_init+0x16b/0x4a0
[ 0.578792] identify_cpu+0x467/0x5c0
[ 0.578792] identify_boot_cpu+0x10/0x9a
[ 0.578792] check_bugs+0x2a/0xa06
[ 0.578792] start_kernel+0x6bc/0x6f1
[ 0.578792] x86_64_start_reservations+0x24/0x26
[ 0.578792] x86_64_start_kernel+0xad/0xb2
[ 0.578792] secondary_startup_64_no_verify+0xe4/0xeb
[ 0.578792] </TASK>
[ 0.578792] ---[ end Kernel panic - not syncing: MCA architectural violation! ]---