Re: [PATCH v7 20/22] KVM: s390: add KVM_S390_ZPCI_OP to manage guest zPCI devices

From: Thomas Huth
Date: Wed May 18 2022 - 05:20:01 EST


On 16/05/2022 17.35, Matthew Rosato wrote:
On 5/16/22 5:52 AM, Thomas Huth wrote:
On 13/05/2022 21.15, Matthew Rosato wrote:
The KVM_S390_ZPCI_OP ioctl provides a mechanism for managing
hardware-assisted virtualization features for s390X zPCI passthrough.

s/s390X/s390x/

Add the first 2 operations, which can be used to enable/disable
the specified device for Adapter Event Notification interpretation.

Signed-off-by: Matthew Rosato <mjrosato@xxxxxxxxxxxxx>
---
  Documentation/virt/kvm/api.rst | 45 +++++++++++++++++++
  arch/s390/kvm/kvm-s390.c       | 23 ++++++++++
  arch/s390/kvm/pci.c            | 81 ++++++++++++++++++++++++++++++++++
  arch/s390/kvm/pci.h            |  2 +
  include/uapi/linux/kvm.h       | 31 +++++++++++++
  5 files changed, 182 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 4a900cdbc62e..a7cd5ebce031 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5645,6 +5645,51 @@ enabled with ``arch_prctl()``, but this may change in the future.
  The offsets of the state save areas in struct kvm_xsave follow the contents
  of CPUID leaf 0xD on the host.
+4.135 KVM_S390_ZPCI_OP
+--------------------
+
+:Capability: KVM_CAP_S390_ZPCI_OP
+:Architectures: s390
+:Type: vcpu ioctl

vcpu? ... you're wiring it up in  kvm_arch_vm_ioctl() later, so I assume it's rather a VM ioctl?

Yup, VM ioctl, bad copy/paste job...


+:Parameters: struct kvm_s390_zpci_op (in)
+:Returns: 0 on success, <0 on error
+
+Used to manage hardware-assisted virtualization features for zPCI devices.
+
+Parameters are specified via the following structure::
+
+  struct kvm_s390_zpci_op {
+    /* in */

If all is "in", why is there a copy_to_user() in the code later?


Oh no, this is a leftover from a prior version...  Good catch.  There should no longer be a copy_to_user.

+    __u32 fh;        /* target device */
+    __u8  op;        /* operation to perform */
+    __u8  pad[3];
+    union {
+        /* for KVM_S390_ZPCIOP_REG_AEN */
+        struct {
+            __u64 ibv;    /* Guest addr of interrupt bit vector */
+            __u64 sb;    /* Guest addr of summary bit */

If this is really a vcpu ioctl, what kind of addresses are you talking about here? virtual addresses? real addresses? absolute addresses?

It's a VM ioctl.  These are guest kernel physical addresses that are later pinned in arch/s390/kvm/pci.c:kvm_s390_pci_aif_enable() as part of handling the ioctl.


+            __u32 flags;
+            __u32 noi;    /* Number of interrupts */
+            __u8 isc;    /* Guest interrupt subclass */
+            __u8 sbo;    /* Offset of guest summary bit vector */
+            __u16 pad;
+        } reg_aen;
+        __u64 reserved[8];
+    } u;
+  };
+
+The type of operation is specified in the "op" field.
+KVM_S390_ZPCIOP_REG_AEN is used to register the VM for adapter event
+notification interpretation, which will allow firmware delivery of adapter
+events directly to the vm, with KVM providing a backup delivery mechanism;
+KVM_S390_ZPCIOP_DEREG_AEN is used to subsequently disable interpretation of
+adapter event notifications.
+
+The target zPCI function must also be specified via the "fh" field. For the
+KVM_S390_ZPCIOP_REG_AEN operation, additional information to establish firmware
+delivery must be provided via the "reg_aen" struct.
+
+The "reserved" field is meant for future extensions.

Maybe also mention the "pad" fields? And add should these also be initialized to 0 by the calling userspace program?

Sure, I can mention them.  And yes, I agree that userspace should initialize them to 0, I'll update the QEMU series accordingly.

I just spotted the corresponding patch in the QEMU series, and I think it should already be fine there, since you're using "= { ... }" while declaring the variables:

+int s390_pci_kvm_aif_disable(S390PCIBusDevice *pbdev)
+{
+ struct kvm_s390_zpci_op args = {
+ .fh = pbdev->fh,
+ .op = KVM_S390_ZPCIOP_DEREG_AEN
+ };

That means unspecified fields will be set to 0 by the compiler already, as far as I know.

Thomas