Re: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid

From: Catalin Marinas
Date: Tue Jan 18 2022 - 07:33:08 EST


Hi Shameer,

On Mon, Nov 22, 2021 at 12:18:40PM +0000, Shameer Kolothum wrote:
>  -TLA+ model. Modified the asidalloc model to incorporate the new
> VMID algo. The main differences are,
>   -flush_tlb_all() instead of local_tlb_flush_all() on rollover.
>   -Introduced INVALID_VMID and vCPU Sched Out logic.
>   -No CnP (Removed UniqueASIDAllCPUs & UniqueASIDActiveTask invariants).
>   -Removed  UniqueVMIDPerCPU invariant for now as it looks like
> because of the speculative fetching with flush_tlb_all() there
> is a small window where this gets triggered. If I change the
> logic back to local_flush_tlb_all(), UniqueVMIDPerCPU seems to
> be fine. With my limited knowledge on TLA+ model, it is not
> clear to me whether this is a problem with the above logic
> or the VMID model implementation. Really appreciate any help
> with the model.
> The initial VMID TLA+ model is here,
> https://github.com/shamiali2008/kernel-tla/tree/private-vmidalloc-v1

I only had a brief look at the TLA+ model and I don't understand why you
have a separate 'shed_out' process. It would run in parallel with the
'sched' but AFAICT you can't really schedule a guest out while you are
in the middle of scheduling it in. I'd rather use the same 'sched'
process and either schedule in an inactive task or schedule out an
active one for a given CPU.

Also active_vmids[] for example is defined on the CPUS domain but you
call vcpu_sched_out() from a process that's not in the CPUS domain but
the SCHED_OUT one.

Regarding UniqueVMIDPerCPU, I think we need to figure out why it
happens. The fact that flush_tlb_all() was made to simulate the
speculative TLB loads is not relevant. In a different spec I have,
arm64kpti.tla, I just used another process that invokes an update_tlbs()
macro so that it can happen at any time. I didn't bother to update the
ASID spec in a similar way but it may be useful. The corresponding
UniqueASIDPerCPU meant that for any two TLB entries on a single CPU, if
they correspond to different tasks (pgd), they should have different
ASIDs. That's a strong requirement, otherwise we end up with the wrong
translation.

Why did you remove the CnP? Do we have this disabled for KVM guests?

--
Catalin