Re: [PATCH v5 01/10] KVM: s390: Extend MEM_OP ioctl by storage key checked cmpxchg

From: Janis Schoetterl-Glausch
Date: Wed Jan 11 2023 - 05:05:06 EST


On Wed, 2023-01-11 at 08:59 +0100, Thomas Huth wrote:
> On 10/01/2023 21.26, Janis Schoetterl-Glausch wrote:
> > User space can use the MEM_OP ioctl to make storage key checked reads
> > and writes to the guest, however, it has no way of performing atomic,
> > key checked, accesses to the guest.
> > Extend the MEM_OP ioctl in order to allow for this, by adding a cmpxchg
> > mode. For now, support this mode for absolute accesses only.
> >
> > This mode can be used, for example, to set the device-state-change
> > indicator and the adapter-local-summary indicator atomically.
> >
> > Signed-off-by: Janis Schoetterl-Glausch <scgl@xxxxxxxxxxxxx>
> > ---
> > include/uapi/linux/kvm.h | 7 +++
> > arch/s390/kvm/gaccess.h | 3 ++
> > arch/s390/kvm/gaccess.c | 102 +++++++++++++++++++++++++++++++++++++++
> > arch/s390/kvm/kvm-s390.c | 41 +++++++++++++++-
> > 4 files changed, 151 insertions(+), 2 deletions(-)
> >
[...]

> > +/**
> > + * cmpxchg_guest_abs_with_key() - Perform cmpxchg on guest absolute address.
> > + * @kvm: Virtual machine instance.
> > + * @gpa: Absolute guest address of the location to be changed.
> > + * @len: Operand length of the cmpxchg, required: 1 <= len <= 16. Providing a
> > + * non power of two will result in failure.
> > + * @old_addr: Pointer to old value. If the location at @gpa contains this value, the
> > + * exchange will succeed. After calling cmpxchg_guest_abs_with_key() *@old_addr
> > + * contains the value at @gpa before the attempt to exchange the value.
> > + * @new: The value to place at @gpa.
> > + * @access_key: The access key to use for the guest access.
> > + *
> > + * Atomically exchange the value at @gpa by @new, if it contains *@old_addr.
> > + * Honors storage keys.
> > + *
> > + * Return: * 0: successful exchange
> > + * * 1: exchange unsuccessful
> > + * * a program interruption code indicating the reason cmpxchg could
> > + * not be attempted
>
> PGM_OPERATION has also the value 1 ... can we be sure that it never happens
> here?

Currently yes: the only program interruption codes that can be returned are
the ones explicit in the code, PGM_ADDRESSING and PGM_PROTECTION.
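
To illustrate why a return value of 1 is unambiguous today, here is a minimal
user-space sketch of how a caller could classify the return value. The
sketch_* names are hypothetical and the constants are stand-ins mirroring the
program interruption codes named in this thread; none of this is code from
the patch.

```c
/* Stand-in values for the program interruption codes discussed above. */
#define SKETCH_PGM_OPERATION  0x01
#define SKETCH_PGM_PROTECTION 0x04
#define SKETCH_PGM_ADDRESSING 0x05

enum sketch_result {
	SKETCH_EXCHANGED,     /* ret == 0 */
	SKETCH_NOT_EXCHANGED, /* ret == 1 */
	SKETCH_PGM,           /* ret > 1: program interruption code */
	SKETCH_ERRNO,         /* ret < 0: negative errno */
};

/*
 * Classify a return value following the documented convention.
 * This only works because the function never returns PGM_OPERATION,
 * whose value (1) would collide with "exchange unsuccessful".
 */
static enum sketch_result sketch_classify(int ret)
{
	if (ret < 0)
		return SKETCH_ERRNO;
	if (ret == 0)
		return SKETCH_EXCHANGED;
	if (ret == 1)
		return SKETCH_NOT_EXCHANGED;
	return SKETCH_PGM;
}
```

A value equal to SKETCH_PGM_OPERATION is classified as "not exchanged", which
is exactly the ambiguity raised in the review.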

> ... maybe it would make sense to use KVM_S390_MEMOP_R_NO_XCHG for
> return value here instead of 1, too, just to be on the safe side?

I didn't like that idea because I consider KVM_S390_MEMOP_R_NO_XCHG to be
part of KVM's API surface, and cmpxchg_guest_abs_with_key() is an internal
function that shouldn't concern itself with that.
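
A minimal user-space sketch of that separation, assuming C11 atomics in place
of the key-checked kernel primitive: the internal helper returns 0 or 1, and
only the outer layer translates 1 into the API-level constant. All sketch_*
names and the value of SKETCH_R_NO_XCHG are placeholders for illustration,
not the patch's code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Placeholder for the uapi constant; the value here is an assumption. */
#define SKETCH_R_NO_XCHG (1 << 16)

/*
 * Internal helper: returns 0 on successful exchange, 1 otherwise.
 * In both cases *old_addr holds the value found at *loc beforehand.
 * No storage key checking is modeled here.
 */
static int sketch_internal_cmpxchg(_Atomic uint64_t *loc, uint64_t *old_addr,
				   uint64_t new_val)
{
	return !atomic_compare_exchange_strong(loc, old_addr, new_val);
}

/* API-facing layer: the only place that knows about SKETCH_R_NO_XCHG. */
static int sketch_memop_cmpxchg(_Atomic uint64_t *loc, uint64_t *old_addr,
				uint64_t new_val)
{
	int r = sketch_internal_cmpxchg(loc, old_addr, new_val);

	if (r == 1)
		return SKETCH_R_NO_XCHG;
	return r;
}
```

With this split, the internal return convention can change without touching
the value user space sees.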

But being unclear on PGM_OPERATION is indeed ugly.
Maybe I should just replace "a program interruption code ..." with the specific ones?
>
> Apart from that, patch looks fine to me.
>
> Thomas
>
>
> > + * * -EINVAL: address misaligned or len not power of two
> > + * * -EAGAIN: transient failure (len 1 or 2)
> > + * * -EOPNOTSUPP: read-only memslot (should never occur)
> > + */
> > +int cmpxchg_guest_abs_with_key(struct kvm *kvm, gpa_t gpa, int len,
> > + __uint128_t *old_addr, __uint128_t new,
> > + u8 access_key)
> > +{
> > + gfn_t gfn = gpa >> PAGE_SHIFT;
> > + struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
> > + bool writable;
> > + hva_t hva;
> > + int ret;
> > +
> > + if (!IS_ALIGNED(gpa, len))
> > + return -EINVAL;
> > +
> > + hva = gfn_to_hva_memslot_prot(slot, gfn, &writable);
> > + if (kvm_is_error_hva(hva))
> > + return PGM_ADDRESSING;
> > + /*
> > + * Check if it's a read-only memslot, even though that cannot occur
> > + * since those are unsupported.
> > + * Don't try to actually handle that case.
> > + */
> > + if (!writable)
> > + return -EOPNOTSUPP;
> > +
> > + hva += offset_in_page(gpa);
> > + switch (len) {
> > + case 1: {
> > + u8 old;
> > +
> > + ret = cmpxchg_user_key((u8 *)hva, &old, *old_addr, new, access_key);
> > + ret = ret < 0 ? ret : old != *old_addr;
> > + *old_addr = old;
> > + break;
> > + }
> > + case 2: {
> > + u16 old;
> > +
> > + ret = cmpxchg_user_key((u16 *)hva, &old, *old_addr, new, access_key);
> > + ret = ret < 0 ? ret : old != *old_addr;
> > + *old_addr = old;
> > + break;
> > + }
> > + case 4: {
> > + u32 old;
> > +
> > + ret = cmpxchg_user_key((u32 *)hva, &old, *old_addr, new, access_key);
> > + ret = ret < 0 ? ret : old != *old_addr;
> > + *old_addr = old;
> > + break;
> > + }
> > + case 8: {
> > + u64 old;
> > +
> > + ret = cmpxchg_user_key((u64 *)hva, &old, *old_addr, new, access_key);
> > + ret = ret < 0 ? ret : old != *old_addr;
> > + *old_addr = old;
> > + break;
> > + }
> > + case 16: {
> > + __uint128_t old;
> > +
> > + ret = cmpxchg_user_key((__uint128_t *)hva, &old, *old_addr, new, access_key);
> > + ret = ret < 0 ? ret : old != *old_addr;
> > + *old_addr = old;
> > + break;
> > + }
> > + default:
> > + return -EINVAL;
> > + }
> > + mark_page_dirty_in_slot(kvm, slot, gfn);
> > + /*
> > + * Assume that the fault is caused by protection, either key protection
> > + * or user page write protection.
> > + */
> > + if (ret == -EFAULT)
> > + ret = PGM_PROTECTION;
> > + return ret;
> > +}
[...]