RE: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid

From: Shameerali Kolothum Thodi
Date: Wed Jan 19 2022 - 04:23:37 EST


Hi Catalin,

> -----Original Message-----
> From: Catalin Marinas [mailto:catalin.marinas@xxxxxxx]
> Sent: 18 January 2022 12:33
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; kvmarm@xxxxxxxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; maz@xxxxxxxxxx; will@xxxxxxxxxx;
> james.morse@xxxxxxx; julien.thierry.kdev@xxxxxxxxx;
> suzuki.poulose@xxxxxxx; jean-philippe@xxxxxxxxxx;
> Alexandru.Elisei@xxxxxxx; qperret@xxxxxxxxxx; Jonathan Cameron
> <jonathan.cameron@xxxxxxxxxx>; Linuxarm <linuxarm@xxxxxxxxxx>
> Subject: Re: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid
>
> Hi Shameer,
>
> On Mon, Nov 22, 2021 at 12:18:40PM +0000, Shameer Kolothum wrote:
> >  -TLA+ model. Modified the asidalloc model to incorporate the new
> > VMID algo. The main differences are,
> >   -flush_tlb_all() instead of local_tlb_flush_all() on rollover.
> >   -Introduced INVALID_VMID and vCPU Sched Out logic.
> >   -No CnP (Removed UniqueASIDAllCPUs & UniqueASIDActiveTask
> invariants).
> >   -Removed  UniqueVMIDPerCPU invariant for now as it looks like
> > because of the speculative fetching with flush_tlb_all() there
> > is a small window where this gets triggered. If I change the
> > logic back to local_flush_tlb_all(), UniqueVMIDPerCPU seems to
> > be fine. With my limited knowledge on TLA+ model, it is not
> > clear to me whether this is a problem with the above logic
> > or the VMID model implementation. Really appreciate any help
> > with the model.
> > The initial VMID TLA+ model is here,
> > https://github.com/shamiali2008/kernel-tla/tree/private-vmidalloc-v1
>
> I only had a brief look at the TLA+ model and I don't understand why you
> have a separate 'shed_out' process. It would run in parallel with the
> 'sched' but AFAICT you can't really schedule a guest out while you are
> in the middle of scheduling it in. I'd rather use the same 'sched'
> process and either schedule in an inactive task or schedule out an
> active one for a given CPU.
>
> Also active_vmids[] for example is defined on the CPUS domain but you
> call vcpu_sched_out() from a process that's not in the CPUS domain but
> the SCHED_OUT one.

Many thanks for taking a look. My bad! The 'sched_out' process would indeed run in parallel
with 'sched' and defeat the purpose. I must say I was really confused by the TLA+ syntax and
am still not very confident with it.

Based on the above suggestion, I have modified it as below,

\* vCPU is scheduled out by KVM
macro vcpu_sched_out() {
    active_kvm[self].task := 0;
    active_vmids[self] := INVALID_VMID;
}
....

\* About to run a Guest VM
process (sched \in CPUS)
{
sched_loop:
    while (TRUE) {
        with (t \in TASKS) {
            if (t # ActiveTask(self))
                call kvm_arm_vmid_update(t);
            else
                vcpu_sched_out();
        }
    }
}

With this, a given CPU either schedules a task in or schedules the active one out, so the
two can no longer happen in parallel. Please let me know if this looks ok.

> Regarding UniqueVMIDPerCPU, I think we need to figure out why it
> happens. The fact that flush_tlb_all() was made to simulate the
> speculative TLB loads is not relevant. In a different spec I have,
> arm64kpti.tla, I just used another process that invokes an update_tlbs()
> macro so that it can happen at any time. I didn't bother to update the
> ASID spec in a similar way but it may be useful.

Ok, I will take a look at arm64kpti.tla and try that approach here.
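
Just so I am sure I understand the idea, is it roughly something like the below? This is
only a sketch of what I have in mind, not code taken from arm64kpti.tla; 'tlb' and
'spec_tlb_load' are placeholder names and may not match how the TLB ends up being
modelled in the VMID spec:

\* Placeholder macro: a speculative walk can bring the currently active
\* (task, VMID) pair into a CPU's TLB at any point in time.
macro spec_tlb_load(cpu) {
    if (active_kvm[cpu].task # 0) {
        tlb[cpu] := tlb[cpu] \union
                    {[task |-> active_kvm[cpu].task,
                      vmid |-> active_vmids[cpu]]};
    }
}

\* Separate process so speculative TLB fills can interleave with everything
\* else, instead of only happening via flush_tlb_all() on rollover. The
\* process id just needs to be distinct from the CPUS used by 'sched'.
process (spec_walker = "spec_walker")
{
spec_loop:
    while (TRUE) {
        with (cpu \in CPUS) {
            spec_tlb_load(cpu);
        }
    }
}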

> The corresponding
> UniqueASIDPerCPU meant that for any two TLB entries on a single CPU, if
> they correspond to different tasks (pgd), they should have different
> ASIDs. That's a strong requirement, otherwise we end up with the wrong
> translation.

Yes, I understand that it is a strong requirement. I also thought this was something that
would trigger easily with the test setup I had on real hardware, but the tests ran for days
without hitting it, so I was not sure whether this is a real issue or just a problem with
the TLA+ implementation.
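
For reference, the VMID version of that property I am aiming for is roughly the below
(again only a sketch, written against the placeholder per-CPU 'tlb' set of (task, vmid)
records from above, so the names may not match the final spec):

UniqueVMIDPerCPU ==
    \A cpu \in CPUS :
        \A e1, e2 \in tlb[cpu] :
            e1.task # e2.task => e1.vmid # e2.vmid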

>
> Why did you remove the CnP? Do we have this disabled for KVM guests?

I removed the CnP-related invariants to simplify things for the first version. Also, I am not
sure what specific changes we need to make for CnP here. Do we still need the switch to the
global pg_dir to prevent any speculative reading? I didn't see that being done anywhere in
KVM at the moment. Maybe I am missing something.

On a side note, in my setup the CnP=TRUE case for asidalloc.tla now fails with
"Error: Invariant TLBEmptyInvalPTE is violated." Please check.

Thanks,
Shameer

> --
> Catalin