Re: [PATCH v5 17/19] KVM: Terminate memslot walks via used_slots

From: Peter Xu
Date: Fri Feb 07 2020 - 15:39:21 EST


On Fri, Feb 07, 2020 at 10:33:25AM -0800, Sean Christopherson wrote:
> On Thu, Feb 06, 2020 at 04:09:44PM -0500, Peter Xu wrote:
> > On Tue, Jan 21, 2020 at 02:31:55PM -0800, Sean Christopherson wrote:
> > > @@ -9652,13 +9652,13 @@ int __x86_set_memory_region(struct kvm *kvm, int id, gpa_t gpa, u32 size)
> > > if (IS_ERR((void *)hva))
> > > return PTR_ERR((void *)hva);
> > > } else {
> > > - if (!slot->npages)
> > > + if (!slot || !slot->npages)
> > > return 0;
> > >
> > > - hva = 0;
> > > + hva = slot->userspace_addr;
> >
> > Is this intended?
>
> Yes. It's possible to allow VA=0 for userspace mappings. It's extremely
> uncommon, but possible. Therefore "hva == 0" shouldn't be used to
> indicate an invalid slot.

Note that this is the deletion path in __x86_set_memory_region(), not the
allocation path. IIUC userspace_addr isn't even used in the code that
follows, so it shouldn't really matter. Or did I misunderstand something?

>
> > > + old_npages = slot->npages;
> > > }
> > >
> > > - old = *slot;
> > > for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
> > > struct kvm_userspace_memory_region m;
> > >
>
> ...
>
> > > @@ -869,63 +869,162 @@ static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
> > > }
> > >
> > > /*
> > > - * Insert memslot and re-sort memslots based on their GFN,
> > > - * so binary search could be used to lookup GFN.
> > > - * Sorting algorithm takes advantage of having initially
> > > - * sorted array and known changed memslot position.
> > > + * Delete a memslot by decrementing the number of used slots and shifting all
> > > + * other entries in the array forward one spot.
> > > + */
> > > +static inline void kvm_memslot_delete(struct kvm_memslots *slots,
> > > + struct kvm_memory_slot *memslot)
> > > +{
> > > + struct kvm_memory_slot *mslots = slots->memslots;
> > > + int i;
> > > +
> > > + if (WARN_ON(slots->id_to_index[memslot->id] == -1))
> > > + return;
> > > +
> > > + slots->used_slots--;
> > > +
> > > + for (i = slots->id_to_index[memslot->id]; i < slots->used_slots; i++) {
> > > + mslots[i] = mslots[i + 1];
> > > + slots->id_to_index[mslots[i].id] = i;
> > > + }
> > > + mslots[i] = *memslot;
> > > + slots->id_to_index[memslot->id] = -1;
> > > +}
> > > +
> > > +/*
> > > + * "Insert" a new memslot by incrementing the number of used slots. Returns
> > > + * the new slot's initial index into the memslots array.
> > > + */
> > > +static inline int kvm_memslot_insert_back(struct kvm_memslots *slots)
> >
> > The naming here didn't help me to understand but a bit more
> > confused...
> >
> > How about "kvm_memslot_insert_end"? Or even unwrap it.
>
> It's not guaranteed to be the end, as there could be multiple unused
> entries at the back of the array. I agree the naming isn't perfect, but
> IMO it's the least crappy option and will be familiar to anyone with C++
> STL (and other languages?) experience. Arguably it would be better to
> follow kernel naming for lists, e.g. head/tail, but there are no
> convenient adverbs for the move helpers, e.g. kvm_memslot_move_backward()
> would be kvm_memslot_move_towards_tail().
>
> I'm very strongly opposed to unwrapping it.
>
> The code would look like this. Without a beefy comment, the high level
> semantics of the KVM_MR_CREATE case are not at all clear. Adding a
> comment gets messy because putting it above the entire if-else makes it
> difficult to understand that its *only* for the CREATE case, and I hate
> having multi-line comments in if-else statements without brackets.
>
> if (change == KVM_MR_CREATE)
> i = slots->used_slots++;
> else
> i = kvm_memslot_move_backward(slots, memslot);

This seems overly complicated, imho... A one-line comment would be
clear enough to me. :)

Please feel free to keep the original code as you wish.
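
To show what I mean by a one-line comment, here is a rough userspace
sketch. It is not kernel code: the toy_* names and types are made up for
illustration, and the "else" branch is only a placeholder for the real
kvm_memslot_move_backward() call.

```c
#include <assert.h>

#define TOY_MAX_SLOTS 8

/* Hypothetical stand-ins for the kernel structures, not the real KVM types. */
struct toy_slot {
	int id;
	unsigned long base_gfn;
};

struct toy_slots {
	int used_slots;
	int id_to_index[TOY_MAX_SLOTS];
	struct toy_slot memslots[TOY_MAX_SLOTS];
};

enum toy_change { TOY_MR_CREATE, TOY_MR_MOVE };

/*
 * Pick the index a changed slot starts from.  moved_index stands in for
 * the result of the move helper and is an assumption of this sketch.
 */
static int toy_pick_index(struct toy_slots *slots, enum toy_change change,
			  int moved_index)
{
	int i;

	/* A new slot starts in the first unused entry, i.e. at used_slots. */
	if (change == TOY_MR_CREATE)
		i = slots->used_slots++;
	else
		i = moved_index;

	return i;
}
```

With two slots already in use, a CREATE returns index 2 and bumps
used_slots to 3, while the other branch leaves used_slots alone.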

>
> > > +{
> > > + return slots->used_slots++;
> > > +}
> > > +
> > > +/*
> > > + * Move a changed memslot backwards in the array by shifting existing slots
> > > + * with a higher GFN toward the front of the array. Note, the changed memslot
> > > + * itself is not preserved in the array, i.e. not swapped at this time, only
> > > + * its new index into the array is tracked. Returns the changed memslot's
> > > + * current index into the memslots array.
> > > + */
> > > +static inline int kvm_memslot_move_backward(struct kvm_memslots *slots,
> > > + struct kvm_memory_slot *memslot)
> >
> > "backward" makes me feel like it's moving towards smaller index,
> > instead it's moving to bigger index. Same applies to "forward" below.
> > I'm not sure whether I'm the only one, though...
>
> Move forward towards the front, and backward towards the back. In the
> languages I am familiar with, e.g. C++ STL, JavaScript, Python, and Golang,
> front==container[0] and back==container[len() - 1].

OK.
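
For anyone else who trips over the direction, a small userspace model of
the quoted loop may help. This is only a sketch under assumptions: the
toy_* types are invented, the WARN_ON_ONCE() checks are dropped, and the
array is assumed to be sorted from highest to lowest base_gfn, as the
quoted comment implies.

```c
#include <assert.h>

#define TOY_MAX_SLOTS 8

/* Hypothetical stand-ins for struct kvm_memory_slot / struct kvm_memslots. */
struct toy_slot {
	int id;
	unsigned long base_gfn;
};

struct toy_slots {
	int used_slots;
	int id_to_index[TOY_MAX_SLOTS];
	struct toy_slot memslots[TOY_MAX_SLOTS];
};

/*
 * Model of the quoted kvm_memslot_move_backward() loop: slots with a higher
 * GFN than the changed slot are shifted one entry toward index 0, and the
 * changed slot's new index (toward the back, i.e. a bigger index) is returned.
 * The changed slot itself is not written into the array here.
 */
static int toy_move_backward(struct toy_slots *slots,
			     const struct toy_slot *memslot)
{
	struct toy_slot *mslots = slots->memslots;
	int i;

	for (i = slots->id_to_index[memslot->id]; i < slots->used_slots - 1; i++) {
		if (memslot->base_gfn > mslots[i + 1].base_gfn)
			break;

		/* Shift the next slot toward the front and update its index. */
		mslots[i] = mslots[i + 1];
		slots->id_to_index[mslots[i].id] = i;
	}
	return i;
}
```

Starting from slots with base_gfn 300, 200, 100 at indices 0, 1, 2, changing
the first slot's base_gfn to 150 shifts the 200 slot to index 0 and returns
index 1 for the changed slot, so "backward" means a larger index.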

>
> > > +{
> > > + struct kvm_memory_slot *mslots = slots->memslots;
> > > + int i;
> > > +
> > > + if (WARN_ON_ONCE(slots->id_to_index[memslot->id] == -1) ||
> > > + WARN_ON_ONCE(!slots->used_slots))
> > > + return -1;
> > > +
> > > + /*
> > > + * Move the target memslot backward in the array by shifting existing
> > > + * memslots with a higher GFN (than the target memslot) towards the
> > > + * front of the array.
> > > + */
> > > + for (i = slots->id_to_index[memslot->id]; i < slots->used_slots - 1; i++) {
> > > + if (memslot->base_gfn > mslots[i + 1].base_gfn)
> > > + break;
> > > +
> > > + WARN_ON_ONCE(memslot->base_gfn == mslots[i + 1].base_gfn);
> >
> > Will this trigger? Note that in __kvm_set_memory_region() we have
> > already checked overlap of memslots.
>
> If you screw up the code it will :-) In a perfect world, no WARN() will
> *ever* trigger. All of the added WARN_ON_ONCE() are to help the next poor
> soul that wants to modify this code.

I normally wouldn't keep a WARN_ON that can never trigger (by "never"
I mean e.g. the same condition is checked twice, so the first check will
always fire first). I only asked in case I overlooked something. Please
feel free to keep it if you want.
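
My reasoning, as a toy sketch rather than the actual check in
__kvm_set_memory_region(): the toy_* names are made up, and I'm assuming
an overlap test on half-open GFN ranges with npages > 0. Under that
assumption, two slots with the same base_gfn always overlap, so the
earlier check would reject them before the WARN could see them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slot with just the fields needed for the range check. */
struct toy_slot {
	unsigned long base_gfn;
	unsigned long npages;
};

/*
 * True if the half-open ranges [base_gfn, base_gfn + npages) of the two
 * slots intersect.  Assumes npages > 0 for both slots.
 */
static bool toy_overlaps(const struct toy_slot *a, const struct toy_slot *b)
{
	return a->base_gfn < b->base_gfn + b->npages &&
	       b->base_gfn < a->base_gfn + a->npages;
}
```

Equal base_gfn values with non-zero sizes make both comparisons true,
while adjacent ranges such as [100, 104) and [104, 108) do not overlap.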

>
> > > +
> > > + /* Shift the next memslot forward one and update its index. */
> > > + mslots[i] = mslots[i + 1];
> > > + slots->id_to_index[mslots[i].id] = i;
> > > + }
> > > + return i;
> > > +}
> > > @@ -1104,8 +1203,13 @@ int __kvm_set_memory_region(struct kvm *kvm,
>
> ...
>
> > > * when the memslots are re-sorted by update_memslots().
> > > */
> > > tmp = id_to_memslot(__kvm_memslots(kvm, as_id), id);
> > > - old = *tmp;
> > > - tmp = NULL;
> >
> > I was confused in that patch, then...
> >
> > > + if (tmp) {
> > > + old = *tmp;
> > > + tmp = NULL;
> >
> > ... now I still don't know why it needs to set to NULL?
>
> To make it abundantly clear that thou shalt not use @tmp, i.e. to force
> using the copy and not the pointer. Note, @tmp is also reused as an
> iterator below.

OK, it still feels a bit strange. We could add a comment instead if you
want to warn others. The only difference is probably that no useless
instruction gets executed. But this is trivial too, so I'll leave it to
others to judge.

Thanks,

--
Peter Xu