Re: [v4 RESEND 1/3] KVM: setup empty irq routing when create vm

From: Sean Christopherson
Date: Fri May 03 2024 - 18:08:40 EST


On Fri, Apr 26, 2024, Yi Wang wrote:
> From: Yi Wang <foxywang@xxxxxxxxxxx>
>
> Add a new function to setup empty irq routing in kvm path, which
> can be invoded in non-architecture-specific functions. The difference
> compared to the kvm_setup_empty_irq_routing() is this function just
> alloc the empty irq routing and does not need synchronize srcu, as
> we will call it in kvm_create_vm().
>
> Using the new adding function, we can setup empty irq routing when
> kvm_create_vm(), so that x86 and s390 no longer need to set
> empty/dummy irq routing when creating an IRQCHIP 'cause it avoid
> an synchronize_srcu.
>
> Signed-off-by: Yi Wang <foxywang@xxxxxxxxxxx>
> ---
> include/linux/kvm_host.h | 1 +
> virt/kvm/irqchip.c | 19 +++++++++++++++++++
> virt/kvm/kvm_main.c | 4 ++++
> 3 files changed, 24 insertions(+)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 48f31dcd318a..9256539139ef 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2100,6 +2100,7 @@ int kvm_set_irq_routing(struct kvm *kvm,
> const struct kvm_irq_routing_entry *entries,
> unsigned nr,
> unsigned flags);
> +int kvm_setup_empty_irq_routing_lockless(struct kvm *kvm);

This is in an #ifdef, the #else needs a stub (for MIPS).

> int kvm_set_routing_entry(struct kvm *kvm,
> struct kvm_kernel_irq_routing_entry *e,
> const struct kvm_irq_routing_entry *ue);
> diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
> index 1e567d1f6d3d..266bab99a8a8 100644
> --- a/virt/kvm/irqchip.c
> +++ b/virt/kvm/irqchip.c
> @@ -237,3 +237,22 @@ int kvm_set_irq_routing(struct kvm *kvm,
>
> return r;
> }
> +
> +int kvm_setup_empty_irq_routing_lockless(struct kvm *kvm)

I vote for kvm_init_irq_routing() to make it more obvious that the API is purely
for initializing the routing, i.e. only to be used at VM creation. Then the
"lockless" tag is largely redundant. And then maybe add a comment about how this
creates an empty routing table? Because every time I look at this code, it takes
me a few seconds to remember how this is actually an empty table.

> +{
> + struct kvm_irq_routing_table *new;
> + int chip_size;
> +
> + new = kzalloc(struct_size(new, map, 1), GFP_KERNEL_ACCOUNT);
> + if (!new)
> + return -ENOMEM;
> +
> + new->nr_rt_entries = 1;
> +
> + chip_size = sizeof(int) * KVM_NR_IRQCHIPS * KVM_IRQCHIP_NUM_PINS;
> + memset(new->chip, -1, chip_size);
> +
> + RCU_INIT_POINTER(kvm->irq_routing, new);
> +
> + return 0;
> +}
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index ff0a20565f90..b5f4fa9d050d 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1285,6 +1285,10 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
> if (r)
> goto out_err;
>
> + r = kvm_setup_empty_irq_routing_lockless(kvm);
> + if (r)
> + goto out_err;

This is too late. It might not matter in practice, but the call before this is
to kvm_arch_post_init_vm(), which quite strongly suggests that *all* common setup
is done before that arch hook is invoked.

Calling this right before kvm_arch_init_vm() seems like the best/easiest fit, e.g.

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2e388972d856..ab607441686f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1197,9 +1197,13 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
rcu_assign_pointer(kvm->buses[i],
kzalloc(sizeof(struct kvm_io_bus), GFP_KERNEL_ACCOUNT));
if (!kvm->buses[i])
- goto out_err_no_arch_destroy_vm;
+ goto out_err_no_irq_routing;
}

+ r = kvm_init_irq_routing(kvm);
+ if (r)
+ goto out_err_no_irq_routing;
+
r = kvm_arch_init_vm(kvm, type);
if (r)
goto out_err_no_arch_destroy_vm;
@@ -1254,6 +1258,8 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
for (i = 0; i < KVM_NR_BUSES; i++)
kfree(kvm_get_bus(kvm, i));
+ kvm_free_irq_routing(kvm);
+out_err_no_irq_routing:
cleanup_srcu_struct(&kvm->irq_srcu);
out_err_no_irq_srcu:
cleanup_srcu_struct(&kvm->srcu);