Re: [GIT PULL] KVM: x86: MMU(ish) fixes for 6.8
From: Paolo Bonzini
Date: Sat Mar 09 2024 - 11:49:45 EST
On Fri, Feb 23, 2024 at 10:15 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Two more MMU-related fixes for 6.8. The first, and worst, fixes a data
> corruption bug during live migration due to KVM failing to mark a memslot
> dirty when emulating an atomic access. Luckily, our userspace caught the
> corruption during checksumming after the final pause, but I've no idea if
> QEMU-based VMs have such protection.
>
> The second fixes a long-standing, but recently exposed, issue where yielding
> mmu_lock to vCPUs attempting to fault in memory that is _currently_ being
> zapped/modified can bog down the invalidation task due to it constantly yielding
> to vCPUs (which end up doing nothing).
>
> The following changes since commit 9895ceeb5cd61092f147f8d611e2df575879dd6f:
>
> Merge tag 'kvmarm-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD (2024-02-16 12:02:38 -0500)
>
> are available in the Git repository at:
>
> https://github.com/kvm-x86/linux.git tags/kvm-x86-fixes-6.8-2
>
> for you to fetch changes up to d02c357e5bfa7dfd618b7b3015624beb71f58f1f:
>
> KVM: x86/mmu: Retry fault before acquiring mmu_lock if mapping is changing (2024-02-23 10:14:34 -0800)
Pulled, thanks.
Paolo
> ----------------------------------------------------------------
> KVM x86 fixes for 6.8, round 2:
>
> - When emulating an atomic access, mark the gfn as dirty in the memslot
> to fix a bug where KVM could fail to mark the slot as dirty during live
> migration, ultimately resulting in guest data corruption due to a dirty
> page not being re-copied from the source to the target.
>
> - Check for mmu_notifier invalidation events before faulting in the pfn,
> and before acquiring mmu_lock, to avoid unnecessary work and lock
> contention. Contending mmu_lock is especially problematic on preemptible
> kernels, as KVM may yield mmu_lock in response to the contention, which
> severely degrades overall performance due to vCPUs making it difficult
> for the task that triggered invalidation to make forward progress.
>
> Note, due to another kernel bug, this fix isn't limited to preemptible
> kernels, as any kernel built with CONFIG_PREEMPT_DYNAMIC=y will yield
> contended rwlocks and spinlocks.
>
> https://lore.kernel.org/all/20240110214723.695930-1-seanjc@xxxxxxxxxx
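
To illustrate the first fix in the summary above, here is a minimal userspace sketch of the idea: an emulated atomic access must set the target gfn's bit in the memslot's dirty bitmap, or live migration will not re-copy the page. This is not the kernel code. Every name below (toy_memslot, toy_mark_gfn_dirty, toy_emulate_cmpxchg) is hypothetical, and marking the page dirty whether or not the compare succeeded is a conservative choice of this sketch, not something stated in the pull request.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical, simplified stand-in for a memslot; not the kernel's struct. */
struct toy_memslot {
	uint64_t base_gfn;           /* first guest frame number in the slot */
	uint64_t npages;             /* number of pages covered by the slot */
	unsigned long *dirty_bitmap; /* one bit per page, NULL if logging is off */
};

static bool toy_gfn_in_slot(const struct toy_memslot *slot, uint64_t gfn)
{
	return gfn >= slot->base_gfn && gfn - slot->base_gfn < slot->npages;
}

/* Set the dirty bit for gfn. Returns false if nothing was recorded. */
static bool toy_mark_gfn_dirty(struct toy_memslot *slot, uint64_t gfn)
{
	assert(slot != NULL);
	if (!slot->dirty_bitmap || !toy_gfn_in_slot(slot, gfn))
		return false;

	uint64_t rel = gfn - slot->base_gfn;
	slot->dirty_bitmap[rel / TOY_BITS_PER_LONG] |=
		1UL << (rel % TOY_BITS_PER_LONG);
	return true;
}

static bool toy_gfn_is_dirty(const struct toy_memslot *slot, uint64_t gfn)
{
	if (!slot->dirty_bitmap || !toy_gfn_in_slot(slot, gfn))
		return false;

	uint64_t rel = gfn - slot->base_gfn;
	return (slot->dirty_bitmap[rel / TOY_BITS_PER_LONG] >>
		(rel % TOY_BITS_PER_LONG)) & 1UL;
}

/*
 * Emulate an atomic compare-and-exchange on a word of guest memory.
 * The key step is the dirty marking: without it, the write is invisible
 * to dirty logging and the page may not be re-sent during migration.
 */
static bool toy_emulate_cmpxchg(struct toy_memslot *slot, uint64_t gfn,
				_Atomic uint64_t *host_addr,
				uint64_t expected, uint64_t new_val)
{
	bool swapped = atomic_compare_exchange_strong(host_addr, &expected,
						      new_val);

	/* Assumption of this sketch: mark dirty regardless of the outcome. */
	toy_mark_gfn_dirty(slot, gfn);
	return swapped;
}
```

A caller would back dirty_bitmap with at least npages bits, run the emulated access, and later scan the bitmap to decide which pages to re-copy.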
>
> ----------------------------------------------------------------
> Sean Christopherson (2):
> KVM: x86: Mark target gfn of emulated atomic instruction as dirty
> KVM: x86/mmu: Retry fault before acquiring mmu_lock if mapping is changing
>
> arch/x86/kvm/mmu/mmu.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> arch/x86/kvm/x86.c | 10 ++++++++++
> include/linux/kvm_host.h | 26 ++++++++++++++++++++++++++
> 3 files changed, 78 insertions(+)
>
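
The second fix ("Retry fault before acquiring mmu_lock if mapping is changing") can be sketched in the same spirit. The following single-threaded toy model shows a fault handler that checks for an in-progress invalidation covering the faulting gfn before it touches the lock, so that it bails out early instead of contending with the invalidating task. All names (toy_mmu, toy_fault_should_retry, toy_handle_fault) are hypothetical, the lock is modeled by a plain counter, and the real kernel code has memory-ordering and locking requirements that this sketch does not attempt to reproduce.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the invalidation state that a fault consults. */
struct toy_mmu {
	_Atomic unsigned long invalidate_seq; /* bumped when an invalidation ends */
	_Atomic int in_progress;              /* number of active invalidations */
	_Atomic uint64_t range_start;         /* invalidated gfn range: [start, end) */
	_Atomic uint64_t range_end;
};

enum toy_fault_result { TOY_FAULT_RETRY, TOY_FAULT_MAPPED };

static void toy_mmu_init(struct toy_mmu *mmu)
{
	atomic_init(&mmu->invalidate_seq, 0UL);
	atomic_init(&mmu->in_progress, 0);
	atomic_init(&mmu->range_start, 0);
	atomic_init(&mmu->range_end, 0);
}

static void toy_invalidate_begin(struct toy_mmu *mmu, uint64_t start,
				 uint64_t end)
{
	assert(start < end);
	atomic_store(&mmu->range_start, start);
	atomic_store(&mmu->range_end, end);
	atomic_fetch_add(&mmu->in_progress, 1);
}

static void toy_invalidate_end(struct toy_mmu *mmu)
{
	atomic_fetch_add(&mmu->invalidate_seq, 1UL);
	atomic_fetch_sub(&mmu->in_progress, 1);
}

/* True if the mapping for gfn is changing or changed since seq was read. */
static bool toy_fault_should_retry(struct toy_mmu *mmu, unsigned long seq,
				   uint64_t gfn)
{
	if (atomic_load(&mmu->in_progress) > 0 &&
	    gfn >= atomic_load(&mmu->range_start) &&
	    gfn < atomic_load(&mmu->range_end))
		return true;

	return atomic_load(&mmu->invalidate_seq) != seq;
}

/* lock_acquisitions counts how often the (modeled) lock was taken. */
static enum toy_fault_result toy_handle_fault(struct toy_mmu *mmu,
					      uint64_t gfn,
					      int *lock_acquisitions)
{
	unsigned long seq = atomic_load(&mmu->invalidate_seq);

	/* Early check: skip the pfn lookup and the lock entirely. */
	if (toy_fault_should_retry(mmu, seq, gfn))
		return TOY_FAULT_RETRY;

	/* ... the pfn would be resolved here ... */

	(*lock_acquisitions)++; /* stand-in for taking the lock */

	/* Re-check under the lock, since state may have changed meanwhile. */
	if (toy_fault_should_retry(mmu, seq, gfn))
		return TOY_FAULT_RETRY;

	return TOY_FAULT_MAPPED;
}
```

In this model a fault on a gfn inside the range being invalidated returns TOY_FAULT_RETRY without incrementing the lock counter, while faults outside the range proceed as usual.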