Re: [PATCH v4 00/10] arm64: Stolen time support

From: Andrew Jones
Date: Tue Sep 03 2019 - 04:03:58 EST


On Fri, Aug 30, 2019 at 09:42:45AM +0100, Steven Price wrote:
> This series adds support for paravirtualized time for arm64 guests and
> KVM hosts, following the specification in Arm's document DEN 0057A:
>
> https://developer.arm.com/docs/den0057/a
>
> It implements support for stolen time, allowing the guest to account
> for time during which it was forcibly not executing.
>
> It doesn't implement support for Live Physical Time (LPT) as there are
> some concerns about the overheads and approach in the above
> specification, and I expect an updated version of the specification to
> be released soon with just the stolen time parts.
>
> NOTE: Patches 8 and 9 will conflict with Mark Rutland's series[1] cleaning
> up the SMCCC conduit. I do feel that the addition of an _invoke() call
> makes a number of call sites cleaner and it should be possible to
> integrate both this and Mark's other cleanups.
>
> [1] https://lore.kernel.org/linux-arm-kernel/20190809132245.43505-1-mark.rutland@xxxxxxx/
>
> Also available as a git tree:
> git://linux-arm.org/linux-sp.git stolen_time/v4
>
> Changes from v3:
> https://lore.kernel.org/lkml/20190821153656.33429-1-steven.price@xxxxxxx/
> * There's no longer a PV_TIME device; instead there are attributes on
> the VCPU. This allows the stolen time structures to be placed
> arbitrarily by user space (subject to 64 byte alignment).
> * Split the documentation between the hypercall interface and the
> VCPU attributes
> * Fixed the type of SMCCC functions to return long not int
>
> Changes from v2:
> https://lore.kernel.org/lkml/20190819140436.12207-1-steven.price@xxxxxxx/
> * Switched from using gfn_to_hva_cache to a new macro kvm_put_guest()
> that can provide the single-copy atomicity required (on arm64). This
> macro is added in patch 4.
> * Tidied up the locking for kvm_update_stolen_time().
> pagefault_disable() was unnecessary and the caller didn't need to
> take kvm->srcu as the function does it itself.
> * Removed struct kvm_arch_pvtime from the arm implementation, replacing
> it with static inline functions which are empty for arm.
> * Fixed a few checkpatch --strict warnings.
>
> Changes from v1:
> https://lore.kernel.org/lkml/20190802145017.42543-1-steven.price@xxxxxxx/
> * The host kernel no longer allocates the stolen time structure;
> instead it is allocated by user space. This means the save/restore
> functionality can be removed.
> * Refactored the code so that arm has stub implementations and no
> initcall is needed
> * Rebased to pick up the Documentation/{virtual->virt} change
> * Bunch of typo fixes
>
> Christoffer Dall (1):
> KVM: arm/arm64: Factor out hypercall handling from PSCI code
>
> Steven Price (9):
> KVM: arm64: Document PV-time interface
> KVM: arm64: Implement PV_FEATURES call
> KVM: Implement kvm_put_guest()
> KVM: arm64: Support stolen time reporting via shared structure
> KVM: Allow kvm_device_ops to be const
> KVM: arm64: Provide VCPU attributes for stolen time
> arm/arm64: Provide a wrapper for SMCCC 1.1 calls
> arm/arm64: Make use of the SMCCC 1.1 wrapper
> arm64: Retrieve stolen time as paravirtualized guest
>
> Documentation/virt/kvm/arm/pvtime.txt | 64 ++++++++++
> Documentation/virt/kvm/devices/vcpu.txt | 14 +++
> arch/arm/include/asm/kvm_host.h | 26 +++++
> arch/arm/kvm/Makefile | 2 +-
> arch/arm/kvm/handle_exit.c | 2 +-
> arch/arm/mm/proc-v7-bugs.c | 13 +--
> arch/arm64/include/asm/kvm_host.h | 30 ++++-
> arch/arm64/include/asm/paravirt.h | 9 +-
> arch/arm64/include/asm/pvclock-abi.h | 17 +++
> arch/arm64/include/uapi/asm/kvm.h | 2 +
> arch/arm64/kernel/cpu_errata.c | 80 +++++--------
> arch/arm64/kernel/paravirt.c | 148 ++++++++++++++++++++++++
> arch/arm64/kernel/time.c | 3 +
> arch/arm64/kvm/Kconfig | 1 +
> arch/arm64/kvm/Makefile | 2 +
> arch/arm64/kvm/guest.c | 9 ++
> arch/arm64/kvm/handle_exit.c | 4 +-
> include/kvm/arm_hypercalls.h | 43 +++++++
> include/kvm/arm_psci.h | 2 +-
> include/linux/arm-smccc.h | 58 ++++++++++
> include/linux/cpuhotplug.h | 1 +
> include/linux/kvm_host.h | 26 ++++-
> include/linux/kvm_types.h | 2 +
> include/uapi/linux/kvm.h | 2 +
> virt/kvm/arm/arm.c | 11 ++
> virt/kvm/arm/hypercalls.c | 68 +++++++++++
> virt/kvm/arm/psci.c | 84 +-------------
> virt/kvm/arm/pvtime.c | 124 ++++++++++++++++++++
> virt/kvm/kvm_main.c | 6 +-
> 29 files changed, 699 insertions(+), 154 deletions(-)
> create mode 100644 Documentation/virt/kvm/arm/pvtime.txt
> create mode 100644 arch/arm64/include/asm/pvclock-abi.h
> create mode 100644 include/kvm/arm_hypercalls.h
> create mode 100644 virt/kvm/arm/hypercalls.c
> create mode 100644 virt/kvm/arm/pvtime.c
>
> --
> 2.20.1
>

Hi Steven,

I had some fun testing this series with the KVM selftests framework. It
all appears to work for me, so you may add

Tested-by: Andrew Jones <drjones@xxxxxxxxxx>

if you like. And below is the test I came up with.
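
A note on the guest code for anyone reading along: the asm blocks
open-code the SMCCC 1.1 HVC calling convention (function ID in x0,
first argument in x1, result read back from x0). The three calls check
that ARCH_FEATURES (0x80000001) knows about PV_TIME_FEATURES
(0xc5000020), ask PV_TIME_FEATURES whether PV_TIME_ST (0xc5000022) is
supported, and finally invoke PV_TIME_ST to retrieve the IPA of the
vcpu's stolen time structure. A hypothetical helper expressing the
same convention more readably (just a sketch, not part of the test)
would be something like:

#include <stdint.h>

static inline int64_t hvc_call(uint64_t func_id, uint64_t arg)
{
	/* SMCCC 1.1: function ID goes in x0, first argument in x1 */
	register uint64_t x0 asm("x0") = func_id;
	register uint64_t x1 asm("x1") = arg;

	/* x0-x3 may all be clobbered by the callee */
	asm volatile("hvc #0"
		     : "+r" (x0), "+r" (x1)
		     :
		     : "x2", "x3", "memory");

	/* the result comes back in x0 */
	return x0;
}

with which the first probe in guest_code() would reduce to
ret = hvc_call(0x80000001, 0xc5000020). Patch 8's SMCCC 1.1 wrapper
plays a similar role on the kernel side.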

Thanks,
drew


From: Andrew Jones <drjones@xxxxxxxxxx>
Date: Tue, 3 Sep 2019 03:45:08 -0400
Subject: [PATCH] selftests: kvm: aarch64 stolen-time test

Signed-off-by: Andrew Jones <drjones@xxxxxxxxxx>
---
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/aarch64/stolen-time.c | 208 ++++++++++++++++++
2 files changed, 209 insertions(+)
create mode 100644 tools/testing/selftests/kvm/aarch64/stolen-time.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index ba7849751989..3151264039ad 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -28,6 +28,7 @@ TEST_GEN_PROGS_x86_64 += clear_dirty_log_test
TEST_GEN_PROGS_x86_64 += dirty_log_test
TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus

+TEST_GEN_PROGS_aarch64 += aarch64/stolen-time
TEST_GEN_PROGS_aarch64 += clear_dirty_log_test
TEST_GEN_PROGS_aarch64 += dirty_log_test
TEST_GEN_PROGS_aarch64 += kvm_create_max_vcpus
diff --git a/tools/testing/selftests/kvm/aarch64/stolen-time.c b/tools/testing/selftests/kvm/aarch64/stolen-time.c
new file mode 100644
index 000000000000..36df2f6baa17
--- /dev/null
+++ b/tools/testing/selftests/kvm/aarch64/stolen-time.c
@@ -0,0 +1,208 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * AArch64 PV stolen time test
+ *
+ * Copyright (C) 2019, Red Hat, Inc.
+ */
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <time.h>
+#include <sched.h>
+#include <pthread.h>
+#include <sys/syscall.h>
+#include "kvm_util.h"
+
+#define ST_IPA_BASE (1 << 30)
+#define MIN_STOLEN_TIME 200000L
+
+struct st_time {
+ uint32_t rev;
+ uint32_t attr;
+ uint64_t st_time;
+};
+
+static uint64_t st_ipa_offset[4];
+static uint64_t guest_stolen_time[4];
+
+static void guest_code(void)
+{
+ struct st_time *st_time;
+ uint64_t cpu;
+ int64_t ipa;
+ int64_t ret;
+
+ asm volatile("mrs %0, mpidr_el1" : "=r" (cpu));
+ cpu &= 0x3;
+
+ asm volatile(
+ "mov x0, %1\n"
+ "mov x1, %2\n"
+ "hvc #0\n"
+ "mov %0, x0\n"
+ : "=r" (ret) : "r" (0x80000001), "r" (0xc5000020) :
+ "x0", "x1", "x2", "x3");
+
+ GUEST_ASSERT(ret == 0);
+
+ asm volatile(
+ "mov x0, %1\n"
+ "mov x1, %2\n"
+ "hvc #0\n"
+ "mov %0, x0\n"
+ : "=r" (ret) : "r" (0xc5000020), "r" (0xc5000022) :
+ "x0", "x1", "x2", "x3");
+
+ GUEST_ASSERT(ret == 0);
+
+ asm volatile(
+ "mov x0, %1\n"
+ "hvc #0\n"
+ "mov %0, x0\n"
+ : "=r" (ipa) : "r" (0xc5000022) :
+ "x0", "x1", "x2", "x3");
+
+ GUEST_ASSERT(ipa == ST_IPA_BASE + st_ipa_offset[cpu]);
+
+ st_time = (struct st_time *)ipa;
+ GUEST_ASSERT(st_time->rev == 0);
+ GUEST_ASSERT(st_time->attr == 0);
+
+ guest_stolen_time[cpu] = st_time->st_time;
+ GUEST_SYNC(0);
+
+ guest_stolen_time[cpu] = st_time->st_time;
+ GUEST_DONE();
+}
+
+static long get_run_delay(void)
+{
+ char path[64];
+ long val[2];
+ FILE *fp;
+
+ sprintf(path, "/proc/%ld/schedstat", syscall(SYS_gettid));
+ fp = fopen(path, "r");
+ fscanf(fp, "%ld %ld ", &val[0], &val[1]);
+ fclose(fp);
+
+ return val[1];
+}
+
+static void *steal_time(void *arg)
+{
+ uint64_t nsecs_per_sec = 1000000000ul;
+ uint64_t sec, nsec;
+ struct timespec ts;
+
+ clock_gettime(CLOCK_MONOTONIC, &ts);
+ sec = ts.tv_sec;
+ nsec = ts.tv_nsec + MIN_STOLEN_TIME;
+ if (nsec > nsecs_per_sec) {
+ sec += 1;
+ nsec -= nsecs_per_sec;
+ }
+
+ while (1) {
+ clock_gettime(CLOCK_MONOTONIC, &ts);
+ if (ts.tv_sec > sec || (ts.tv_sec == sec && ts.tv_nsec >= nsec))
+ break;
+ }
+
+ return NULL;
+}
+
+static void run_vcpu(struct kvm_vm *vm, uint32_t vcpuid)
+{
+ struct ucall uc;
+
+ vcpu_ioctl(vm, vcpuid, KVM_RUN, NULL);
+
+ switch (get_ucall(vm, vcpuid, &uc)) {
+ case UCALL_SYNC:
+ case UCALL_DONE:
+ break;
+ case UCALL_ABORT:
+ TEST_ASSERT(false, "%s at %s:%d", (const char *)uc.args[0],
+ __FILE__, uc.args[1]);
+ default:
+ TEST_ASSERT(false, "Unexpected exit: %s",
+ exit_reason_str(vcpu_state(vm, vcpuid)->exit_reason));
+ }
+}
+
+int main(int ac, char **av)
+{
+ struct kvm_device_attr dev = {
+ .group = KVM_ARM_VCPU_PVTIME_CTRL,
+ .attr = KVM_ARM_VCPU_PVTIME_SET_IPA,
+ };
+ uint64_t pvtime_memslot_size;
+ struct kvm_vm *vm;
+ pthread_attr_t attr;
+ pthread_t thread;
+ cpu_set_t cpuset;
+ long stolen_time;
+ int i;
+
+ CPU_ZERO(&cpuset);
+ CPU_SET(0, &cpuset);
+ pthread_attr_init(&attr);
+ pthread_attr_setaffinity_np(&attr, sizeof(cpu_set_t), &cpuset);
+ pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+
+ pvtime_memslot_size = 64 * 1024; /* one maximum-sized host page */
+
+ /* create a one-vcpu guest and the pvtime memslot */
+ vm = vm_create_default(0, 0, guest_code);
+ vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, ST_IPA_BASE, 1,
+ 16 /* vm uses 4k pages */, 0);
+ virt_map(vm, ST_IPA_BASE, ST_IPA_BASE, pvtime_memslot_size, 0);
+ ucall_init(vm, UCALL_MMIO, NULL);
+
+ /* add 3 more vcpus */
+ for (i = 1; i < 4; ++i) {
+ vm_vcpu_add_default(vm, i, guest_code);
+ st_ipa_offset[i] = i * 64;
+ sync_global_to_guest(vm, st_ipa_offset[i]);
+ }
+
+ /* add pvtime to each vcpu */
+ for (i = 0; i < 4; ++i) {
+ uint64_t st_ipa = ST_IPA_BASE + st_ipa_offset[i];
+ dev.addr = (uint64_t)&st_ipa;
+ vcpu_ioctl(vm, i, KVM_HAS_DEVICE_ATTR, &dev);
+ vcpu_ioctl(vm, i, KVM_SET_DEVICE_ATTR, &dev);
+ }
+
+ /* run the tests on each vcpu */
+ for (i = 0; i < 4; ++i) {
+ /* first vcpu run */
+ run_vcpu(vm, i);
+ sync_global_from_guest(vm, guest_stolen_time[i]);
+ TEST_ASSERT(guest_stolen_time[i] == 0, "Expected stolen_time = 0");
+
+ /* steal time from the vcpu */
+ stolen_time = get_run_delay();
+ pthread_create(&thread, &attr, steal_time, NULL);
+ pthread_yield();
+ pthread_join(thread, NULL);
+ stolen_time = get_run_delay() - stolen_time;
+ TEST_ASSERT(stolen_time >= MIN_STOLEN_TIME,
+ "Expected stolen time >= %ld, got %ld",
+ MIN_STOLEN_TIME, stolen_time);
+
+ /* run vcpu again and check the stolen time */
+ run_vcpu(vm, i);
+ sync_global_from_guest(vm, guest_stolen_time[i]);
+ TEST_ASSERT(guest_stolen_time[i] >= stolen_time,
+ "Expected stolen_time >= %ld, got %ld",
+ stolen_time, guest_stolen_time[i]);
+
+ printf("CPU%d: %ld", i, guest_stolen_time[i]);
+ if (stolen_time == guest_stolen_time[i])
+ printf(" (BONUS: guest stolen_time even exactly matches run_delay)");
+ printf("\n");
+ }
+
+ return 0;
+}
--
2.18.1