Re: [PATCH v3 12/12] KVM: arm64: selftests: arch_timer: Support vCPU migration

From: Raghavendra Rao Ananta
Date: Tue Sep 07 2021 - 12:15:33 EST


On Sun, Sep 5, 2021 at 11:39 PM Andrew Jones <drjones@xxxxxxxxxx> wrote:
>
> On Fri, Sep 03, 2021 at 01:53:27PM -0700, Raghavendra Rao Ananta wrote:
> > On Fri, Sep 3, 2021 at 4:05 AM Andrew Jones <drjones@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Sep 01, 2021 at 09:14:12PM +0000, Raghavendra Rao Ananta wrote:
> > > > Since the timer stack (hardware and KVM) is per-CPU, there
> > > > are potential chances for races to occur when the scheduler
> > > > decides to migrate a vCPU thread to a different physical CPU.
> > > > Hence, include an option to stress-test this part as well by
> > > > forcing the vCPUs to migrate across physical CPUs in the
> > > > system at a particular rate.
> > > >
> > > > Originally, the bug for the fix with commit 3134cc8beb69d0d
> > > > ("KVM: arm64: vgic: Resample HW pending state on deactivation")
> > > > was discovered using arch_timer test with vCPU migrations and
> > > > can be easily reproduced.
> > > >
> > > > Signed-off-by: Raghavendra Rao Ananta <rananta@xxxxxxxxxx>
> > > > ---
> > > > .../selftests/kvm/aarch64/arch_timer.c | 108 +++++++++++++++++-
> > > > 1 file changed, 107 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/tools/testing/selftests/kvm/aarch64/arch_timer.c b/tools/testing/selftests/kvm/aarch64/arch_timer.c
> > > > index 1383f33850e9..de246c7afab2 100644
> > > > --- a/tools/testing/selftests/kvm/aarch64/arch_timer.c
> > > > +++ b/tools/testing/selftests/kvm/aarch64/arch_timer.c
> > > > @@ -14,6 +14,8 @@
> > > > *
> > > > * The test provides command-line options to configure the timer's
> > > > * period (-p), number of vCPUs (-n), and iterations per stage (-i).
> > > > + * To stress-test the timer stack even more, an option to migrate the
> > > > + * vCPUs across pCPUs (-m), at a particular rate, is also provided.
> > > > *
> > > > * Copyright (c) 2021, Google LLC.
> > > > */
> > > > @@ -24,6 +26,8 @@
> > > > #include <pthread.h>
> > > > #include <linux/kvm.h>
> > > > #include <linux/sizes.h>
> > > > +#include <linux/bitmap.h>
> > > > +#include <sys/sysinfo.h>
> > > >
> > > > #include "kvm_util.h"
> > > > #include "processor.h"
> > > > @@ -41,12 +45,14 @@ struct test_args {
> > > > int nr_vcpus;
> > > > int nr_iter;
> > > > int timer_period_ms;
> > > > + int migration_freq_ms;
> > > > };
> > > >
> > > > static struct test_args test_args = {
> > > > .nr_vcpus = NR_VCPUS_DEF,
> > > > .nr_iter = NR_TEST_ITERS_DEF,
> > > > .timer_period_ms = TIMER_TEST_PERIOD_MS_DEF,
> > > > + .migration_freq_ms = 0, /* Turn off migrations by default */
> > >
> > > I'd rather we enable good tests like these by default.
> > >
> > Well, that was my original idea, but I was concerned about the ease
> > for diagnosing
> > things since it'll become too noisy. And so I let it as a personal
> > preference. But I can
> > include it back and see how it goes.
>
> Break the tests into two? One run without migration and one with. If
> neither work, then we can diagnose the one without first, possibly
> avoiding the need to diagnose the one with.
>
Right. I guess that's where the test's migration switch can come in handy.
Let's turn migration ON by default. If we see a failure, we can turn
OFF migration
and run the tests. I'll try to include some verbose printing for diagnosability.

Regards,
Raghavendra
> Thanks,
> drew
>