Re: [PATCH v5 15/19] KVM: Provide common implementation for generic dirty log functions

From: Vitaly Kuznetsov
Date: Mon Feb 17 2020 - 10:39:53 EST


Sean Christopherson <sean.j.christopherson@xxxxxxxxx> writes:

> On Fri, Feb 07, 2020 at 07:53:34PM -0500, Peter Xu wrote:
>> On Fri, Feb 07, 2020 at 04:42:33PM -0800, Sean Christopherson wrote:
>> > On Fri, Feb 07, 2020 at 07:18:32PM -0500, Peter Xu wrote:
>> > > On Fri, Feb 07, 2020 at 11:45:32AM -0800, Sean Christopherson wrote:
>> > > > +Vitaly for HyperV
>> > > >
>> > > > On Thu, Feb 06, 2020 at 04:41:06PM -0500, Peter Xu wrote:
>> > > > > On Thu, Feb 06, 2020 at 01:21:20PM -0800, Sean Christopherson wrote:
>> > > > > > On Thu, Feb 06, 2020 at 03:02:00PM -0500, Peter Xu wrote:
>> > > > > > > But that matters to this patch because if MIPS can use
>> > > > > > > kvm_flush_remote_tlbs(), then we probably don't need this
>> > > > > > > arch-specific hook any more and we can directly call
>> > > > > > > kvm_flush_remote_tlbs() after syncing the dirty log when flush==true.
>> > > > > >
>> > > > > > Ya, the asid_flush_mask in kvm_vz_flush_shadow_all() is the only thing
>> > > > > > that prevents calling kvm_flush_remote_tlbs() directly, but I have no
>> > > > > > clue as to the importance of that code.
>> > > > >
>> > > > > As said above, I think the x86 lockdep is really not necessary; and
>> > > > > considering that MIPS could be the only one that will use the new hook
>> > > > > introduced in this patch... shall we figure that out first?
>> > > >
>> > > > So I prepped a follow-up patch to make kvm_arch_dirty_log_tlb_flush() a
>> > > > MIPS-only hook and use kvm_flush_remote_tlbs() directly for arm and x86,
>> > > > but then I realized x86 *has* a hook to do a precise remote TLB flush.
>> > > > There's even an existing kvm_flush_remote_tlbs_with_address() call on a
>> > > > memslot, i.e. this exact scenario. So arguably, x86 should be using the
>> > > > more precise flush and should keep kvm_arch_dirty_log_tlb_flush().
>> > > >
>> > > > But, the hook is only used when KVM is running as an L1 on top of HyperV,
>> > > > and I assume dirty logging isn't used much, if at all, for L1 KVM on
>> > > > HyperV?
>> > > >
>> > > > I see three options:
>> > > >
>> > > > 1. Make kvm_arch_dirty_log_tlb_flush() MIPS-only and call
>> > > > kvm_flush_remote_tlbs() directly for arm and x86. Add comments to
>> > > > explain when an arch should implement kvm_arch_dirty_log_tlb_flush().
>> > > >
>> > > > 2. Change x86 to use kvm_flush_remote_tlbs_with_address() when flushing
>> > > > a memslot after the dirty log is grabbed by userspace.
>> > > >
>> > > > 3. Keep the resulting code as is, but add a comment in x86's
>> > > > kvm_arch_dirty_log_tlb_flush() to explain why it uses
>> > > > kvm_flush_remote_tlbs() instead of the with_address() variant.
>> > > >
>> > > > I strongly prefer (2) or (3), but I'll defer to Vitaly as to which of
>> > > > those is preferable.
>> > > >
>> > > > I don't like (1) because (a) it requires more lines of code (well,
>> > > > comments) to explain why kvm_flush_remote_tlbs() is the default, and (b)
>> > > > it would require even more comments, which would be x86-specific in
>> > > > generic KVM, to explain why x86 doesn't use its with_address() flush, or
>> > > > we'd lose that info altogether.
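>> > > >
>> > > > For reference, the call shapes the options boil down to (both
>> > > > functions exist today; "slot" here stands for the memslot being
>> > > > flushed):
>> > > >
>> > > > 	/* (1): coarse, flush every TLB entry for the VM */
>> > > > 	kvm_flush_remote_tlbs(kvm);
>> > > >
>> > > > 	/* (2): precise, flush only the memslot's gfn range (x86-only) */
>> > > > 	kvm_flush_remote_tlbs_with_address(kvm, slot->base_gfn,
>> > > > 					   slot->npages);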
>> > > >
>> > >
>> > > I proposed the 4th solution here:
>> > >
>> > > https://lore.kernel.org/kvm/20200207223520.735523-1-peterx@xxxxxxxxxx/
>> > >
>> > > I'm not sure whether that's acceptable, but if it is, then we can
>> > > drop the kvm_arch_dirty_log_tlb_flush() hook, or even move on to
>> > > per-slot tlb flushing.
>> >
>> > This effectively is per-slot TLB flushing; it just has a different name.
>> > I.e. s/kvm_arch_dirty_log_tlb_flush/kvm_arch_flush_remote_tlbs_memslot.
>> > I'm not opposed to that name change. And on second and third glance, I
>> > probably prefer it. That would more or less follow the naming of
>> > kvm_arch_flush_shadow_all() and kvm_arch_flush_shadow_memslot().
>>
>> Note that the major point of the above patchset is not about doing the TLB
>> flush per-memslot or globally. It's more about whether we can provide a
>> common entry point for TLB flushing. Say, after that series, we should be
>> able to flush the TLB on all archs (notably including MIPS) as:
>>
>> kvm_flush_remote_tlbs(kvm);
>>
>> And with the same idea we can also introduce a ranged version.
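>>
>> Roughly like this (a sketch of the idea, not the actual patch; the
>> arch hook name and its placement here are illustrative):
>>
>> /* Common entry point, callable from generic code on any arch: */
>> void kvm_flush_remote_tlbs(struct kvm *kvm)
>> {
>> 	/*
>> 	 * Give the arch a chance to do its extra bookkeeping (e.g.
>> 	 * MIPS' asid_flush_mask) before kicking the remote vCPUs.
>> 	 */
>> 	kvm_arch_flush_remote_tlb(kvm);
>> 	kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH);
>> }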
>>
>> >
>> > I don't want to go straight to kvm_arch_flush_remote_tlb_with_address()
>> > because that loses the important distinction (on x86) that slots_lock is
>> > expected to be held.
>>
>> Sorry, I'm still puzzled as to why that lockdep is so important and
>> special for x86... For example, what if we move that lockdep to the
>> callers of kvm_arch_dirty_log_tlb_flush() so it protects even more
>> arches (where we do get/clear dirty log)? IMHO the callers must hold
>> slots_lock anyway, whether on x86 or not.
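>>
>> Concretely (a sketch; this is the generic dirty log path this series
>> already touches, with the assertion hoisted out of x86):
>>
>> 	/* in kvm_get_dirty_log_protect() / kvm_clear_dirty_log_protect() */
>> 	lockdep_assert_held(&kvm->slots_lock);
>> 	if (flush)
>> 		kvm_arch_dirty_log_tlb_flush(kvm, memslot);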
>
> Following the breadcrumbs leads to the comment in
> kvm_mmu_slot_remove_write_access(), which says:
>
> /*
> * kvm_mmu_slot_remove_write_access() and kvm_vm_ioctl_get_dirty_log()
> * which do tlb flush out of mmu-lock should be serialized by
> * kvm->slots_lock otherwise tlb flush would be missed.
> */
>
> I.e. write-protecting a memslot and grabbing the dirty log for the memslot
> need to be serialized. It's quite obvious *now* that get_dirty_log() holds
> slots_lock, but the purpose of lockdep assertions isn't just to verify the
> current functionality, it's to help ensure the correctness of future code
> and to document assumptions in the code.
>
> Digging deeper, there are four functions, all related to dirty logging, in
> the x86 mmu that basically open code what x86's
> kvm_arch_flush_remote_tlbs_memslot() would look like if it used the
> range-based flushing.
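>
> Consolidated, it would look something like this (a sketch of option
> (2); the comment wording is approximate):
>
> void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm,
> 					struct kvm_memory_slot *memslot)
> {
> 	/*
> 	 * All TLB flushes for a memslot are done out of mmu_lock and
> 	 * thus must be serialized by slots_lock, else a flush could
> 	 * be missed (see kvm_mmu_slot_remove_write_access()).
> 	 */
> 	lockdep_assert_held(&kvm->slots_lock);
>
> 	kvm_flush_remote_tlbs_with_address(kvm, memslot->base_gfn,
> 					   memslot->npages);
> }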
>
> Unless it's functionally incorrect (Vitaly?), going with option (2) and
> naming the hook kvm_arch_flush_remote_tlbs_memslot() seems like the obvious
> choice, e.g. the final cleanup gives this diff stat:

(I apologize again for not replying in time)

I think this is a valid approach and your option (2) would also be my
choice. I also don't think there's going to be a problem when (if)
Hyper-V adds support for PML (eVMCSv2?).

>
> arch/x86/kvm/mmu/mmu.c | 34 +++++++++-------------------------
> 1 file changed, 9 insertions(+), 25 deletions(-)
>

Looks nice :-)

--
Vitaly