Re: [PATCH v19 101/130] KVM: TDX: handle ept violation/misconfig exit

From: Reinette Chatre
Date: Wed May 01 2024 - 12:54:30 EST


Hi Isaku,

On 5/1/2024 8:56 AM, Isaku Yamahata wrote:
> On Tue, Apr 30, 2024 at 01:47:07PM -0700,
> Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:
>> On 4/3/2024 11:42 AM, Isaku Yamahata wrote:
>>> On Mon, Apr 01, 2024 at 12:10:58PM +0800,
>>> Chao Gao <chao.gao@xxxxxxxxx> wrote:
>>>
>>>>> +static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
>>>>> +{
>>>>> + unsigned long exit_qual;
>>>>> +
>>>>> + if (kvm_is_private_gpa(vcpu->kvm, tdexit_gpa(vcpu))) {
>>>>> + /*
>>>>> + * Always treat SEPT violations as write faults. Ignore the
>>>>> + * EXIT_QUALIFICATION reported by TDX-SEAM for SEPT violations.
>>>>> + * TD private pages are always RWX in the SEPT tables,
>>>>> + * i.e. they're always mapped writable. Just as importantly,
>>>>> + * treating SEPT violations as write faults is necessary to
>>>>> + * avoid COW allocations, which will cause TDAUGPAGE failures
>>>>> + * due to aliasing a single HPA to multiple GPAs.
>>>>> + */
>>>>> +#define TDX_SEPT_VIOLATION_EXIT_QUAL EPT_VIOLATION_ACC_WRITE
>>>>> + exit_qual = TDX_SEPT_VIOLATION_EXIT_QUAL;
>>>>> + } else {
>>>>> + exit_qual = tdexit_exit_qual(vcpu);
>>>>> + if (exit_qual & EPT_VIOLATION_ACC_INSTR) {
>>>>
>>>> Unless the CPU has a bug, instruction fetch in TD from shared memory causes a
>>>> #PF. I think you can add a comment for this.
>>>
>>> Yes.
>>>
>>>
>>>> Maybe KVM_BUG_ON() is more appropriate as it signifies a potential bug.
>>>
>>> Bug of what component? CPU. If so, I think KVM_EXIT_INTERNAL_ERROR +
>>> KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON is more appropriate.
>>>
>>
>> Is below what you have in mind?
>
> Yes. data[0] should be the raw value of exit reason if possible.
> data[2] should be exit_qual. Hmm, I don't find document on data[] for

Did you perhaps intend to write "data[1] should be exit_qual" or would you
like to see ndata = 3? I followed existing usages, for example [1] and [2],
that have ndata = 2 with "data[1] = vcpu->arch.last_vmentry_cpu".

> KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON.
> Qemu doesn't assumt ndata = 2. Just report all data within ndata.

I am not sure I interpreted your response correctly so I share one possible
snippet below as I interpret it. Could you please check where I misinterpreted
you? I could also make ndata = 3 to break the existing custom and add
"data[2] = vcpu->arch.last_vmentry_cpu" to match existing pattern. What do you
think?

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 499c6cd9633f..ba81e6f68c97 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1305,11 +1305,20 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
} else {
exit_qual = tdexit_exit_qual(vcpu);
if (exit_qual & EPT_VIOLATION_ACC_INSTR) {
+ union tdx_exit_reason exit_reason = to_tdx(vcpu)->exit_reason;
+
+ /*
+ * Instruction fetch in TD from shared memory
+ * causes a #PF.
+ */
pr_warn("kvm: TDX instr fetch to shared GPA = 0x%lx @ RIP = 0x%lx\n",
tdexit_gpa(vcpu), kvm_rip_read(vcpu));
- vcpu->run->exit_reason = KVM_EXIT_EXCEPTION;
- vcpu->run->ex.exception = PF_VECTOR;
- vcpu->run->ex.error_code = exit_qual;
+ vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+ vcpu->run->internal.suberror =
+ KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
+ vcpu->run->internal.ndata = 2;
+ vcpu->run->internal.data[0] = exit_reason.full;
+ vcpu->run->internal.data[1] = exit_qual;
return 0;
}
}

Reinette

[1] https://github.com/kvm-x86/linux/blob/next/arch/x86/kvm/vmx/vmx.c#L6587
[2] https://github.com/kvm-x86/linux/blob/next/arch/x86/kvm/svm/svm.c#L3436