Re: [PATCH 05/12] KVM: X86/MMU: Clear unsync bit directly in __mmu_unsync_walk()

From: Lai Jiangshan
Date: Thu Jul 21 2022 - 05:33:11 EST


On Wed, Jul 20, 2022 at 3:52 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:

> > ---
> > arch/x86/kvm/mmu/mmu.c | 22 +++++++++++++---------
> > 1 file changed, 13 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index f35fd5c59c38..2446ede0b7b9 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -1794,19 +1794,23 @@ static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
> > return -ENOSPC;
> >
> > ret = __mmu_unsync_walk(child, pvec);
> > - if (!ret) {
> > - clear_unsync_child_bit(sp, i);
> > - continue;
> > - } else if (ret > 0) {
> > - nr_unsync_leaf += ret;
> > - } else
> > + if (ret < 0)
> > return ret;
> > - } else if (child->unsync) {
> > + nr_unsync_leaf += ret;
> > + }
> > +
> > + /*
> > + * Clear unsync bit for @child directly if @child is fully
> > + * walked and all the unsync shadow pages descended from
> > + * @child (including itself) are added into @pvec, the caller
> > + * must sync or zap all the unsync shadow pages in @pvec.
> > + */
> > + clear_unsync_child_bit(sp, i);
> > + if (child->unsync) {
> > nr_unsync_leaf++;
> > if (mmu_pages_add(pvec, child, i))
>
> This ordering is wrong, no? If the child itself is unsync and can't be added to
> @pvec, i.e. fails here, then clearing its bit in unsync_child_bitmap is wrong.

mmu_pages_add() can always successfully add the page to @pvec and
the caller needs to guarantee there is enough room to do so.

When it returns true, it means it will fail if you keep adding pages.

>
> I also dislike that that this patch obfuscates that a shadow page can't be unsync
> itself _and_ have unsync children (because only PG_LEVEL_4K can be unsync). In
> other words, keep the
>
> if (child->unsync_children) {
>
> } else if (child->unsync) {
>
> }
>

The code was not streamlined like this just because
I need to add some comments on clear_unsync_child_bit().

Duplicated clear_unsync_child_bit() would require
duplicated comments. I will use "See above" instead.