[PATCH v2] KVM: x86/mmu: Update number of zapped pages even if page list is stable

From: Sean Christopherson
Date: Mon Nov 29 2021 - 18:52:43 EST


When zapping obsolete pages, update the running count of zapped pages
regardless of whether or not the list has become unstable due to zapping
a shadow page with its own child shadow pages. If the VM is backed by
mostly 4kb pages, KVM can zap an absurd number of SPTEs without bumping
the batch count and thus without yielding. In the worst case scenario,
this can cause a soft lokcup.

watchdog: BUG: soft lockup - CPU#12 stuck for 22s! [dirty_log_perf_:13020]
RIP: 0010:workingset_activation+0x19/0x130
mark_page_accessed+0x266/0x2e0
kvm_set_pfn_accessed+0x31/0x40
mmu_spte_clear_track_bits+0x136/0x1c0
drop_spte+0x1a/0xc0
mmu_page_zap_pte+0xef/0x120
__kvm_mmu_prepare_zap_page+0x205/0x5e0
kvm_mmu_zap_all_fast+0xd7/0x190
kvm_mmu_invalidate_zap_pages_in_memslot+0xe/0x10
kvm_page_track_flush_slot+0x5c/0x80
kvm_arch_flush_shadow_memslot+0xe/0x10
kvm_set_memslot+0x1a8/0x5d0
__kvm_set_memory_region+0x337/0x590
kvm_vm_ioctl+0xb08/0x1040

Fixes: fbb158cb88b6 ("KVM: x86/mmu: Revert "Revert "KVM: MMU: zap pages in batch""")
Reported-by: David Matlack <dmatlack@xxxxxxxxxx>
Reviewed-by: Ben Gardon <bgardon@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---

v2:
- Rebase to kvm/master, commit 30d7c5d60a88 ("KVM: SEV: expose...")
- Collect Ben's review, modulo bad splat.
- Copy+paste the correct splat and symptom. [David].

@David, I kept the unstable declaration out of the loop, mostly because I
really don't like putting declarations in loops, but also because
nr_zapped is declared out of the loop and I didn't want to change that
unnecessarily or make the code inconsistent.

arch/x86/kvm/mmu/mmu.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 0c839ee1282c..208c892136bf 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5576,6 +5576,7 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm)
{
struct kvm_mmu_page *sp, *node;
int nr_zapped, batch = 0;
+ bool unstable;

restart:
list_for_each_entry_safe_reverse(sp, node,
@@ -5607,11 +5608,12 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm)
goto restart;
}

- if (__kvm_mmu_prepare_zap_page(kvm, sp,
- &kvm->arch.zapped_obsolete_pages, &nr_zapped)) {
- batch += nr_zapped;
+ unstable = __kvm_mmu_prepare_zap_page(kvm, sp,
+ &kvm->arch.zapped_obsolete_pages, &nr_zapped);
+ batch += nr_zapped;
+
+ if (unstable)
goto restart;
- }
}

/*
--
2.34.0.rc2.393.gf8c9666880-goog