Re: [PART2 PATCH v5 12/12] svm: Implements update_pi_irte hook to setup posted interrupt

From: Suravee Suthikulpanit
Date: Tue Aug 16 2016 - 11:18:48 EST


Hi Radim,


On 8/13/16 19:03, Radim KrÄmÃÅ wrote:
2016-07-25 04:32-0500, Suravee Suthikulpanit:
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
@@ -1485,9 +1521,16 @@ static void avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
WARN_ON(is_run == !!(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK));

entry &= ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
- if (is_run)
+ if (is_run) {
entry |= AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
- WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
+ WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
+ avic_update_iommu(vcpu, h_physical_id,
+ page_to_phys(svm->avic_backing_page), 1);
+ } else {
+ avic_update_iommu(vcpu, h_physical_id,
+ page_to_phys(svm->avic_backing_page), 0);
+ WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
+ }

You need to do the same change twice ... I guess it is time to factor
the code. :)

Wouldn't the following be an improvement in the !is_run path too?

static void avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
{
svm->avic_is_running = is_run;

if (is_run)
avic_vcpu_load(vcpu, vcpu->cpu);
else
avic_vcpu_put(vcpu);
}


I like this change. Thanks.

+static void svm_pi_list_add(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi)
+{
+ bool found = false;
+ unsigned long flags;
+ struct amd_iommu_pi_data *cur;
+
+ spin_lock_irqsave(&svm->pi_list_lock, flags);
+ list_for_each_entry(cur, &svm->pi_list, node) {
+ if (cur->ir_data != pi->ir_data)
+ continue;
+ found = true;

This optimization turned out to be ugly ... sorry.

That's okay. It makes sense to avoid using the hash table if we can.

Manipulation with pi_list is hard to understand, IMO, so a comment
explaining why we couldn't do that without traversing a list and
comparing pi->ir_data would be nice.

I'll add more comment here.

Maybe I was a bit confused by reusing amd_iommu_pi_data when all we care
about is a list of cur->ir_data -- can't we have a list of just ir_data?

Actually, in SVM, we care about posted-interrupt information, which is generated from the SVM side, and stored in the amd_iommu_pi_data. This is also communicated to IOMMU via the irq_set_vcpu_affinity().

Here, I only use ir_data to differentiate amd_iommu_pi_data.

[....]
+
+ /* Try to enable guest_mode in IRTE */
+ pi_data->ga_tag = AVIC_GATAG(kvm->arch.avic_vm_id,
+ vcpu->vcpu_id);
+ pi_data->vcpu_data = &vcpu_info;
+ pi_data->is_guest_mode = true;
+ ret = irq_set_vcpu_affinity(host_irq, pi_data);
+
+ /**
+ * We save the pointer to pi_data in the struct
+ * vcpu_svm so that we can reference to them directly
+ * when we update vcpu scheduling information in IOMMU
+ * irte.
+ */
+ if (!ret && pi_data->is_guest_mode)
+ svm_pi_list_add(svm, pi_data);

pi_data leaks in the else case.

(Allocating the element in svm_pi_list_add() would solve this.)

Ahh .. good catch.

Thanks,
Suravee