>> @@ -1767,6 +1874,14 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm,
>>
>> kvm_flush_remote_tlbs(kvm);
>>
>> + if (atomic_read(&kvm->arch.reader_counter)) {
>> + kvm_mmu_isolate_pages(invalid_list);
>> + sp = list_first_entry(invalid_list, struct kvm_mmu_page, link);
>> + list_del_init(invalid_list);
>> + call_rcu(&sp->rcu, free_pages_rcu);
>> + return;
>> + }
>> +
>
> I think we should do this unconditionally. The cost of ping-ponging the shared cache line containing reader_counter will increase with large SMP counts. On the other hand, zap_page is very rare, so it can afford to be a little slower. Also, fewer code paths = easier to understand.
>
On soft mmu, zap_page happens very frequently; doing this unconditionally causes a performance regression in my test.
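
For concreteness, the unconditional variant would look roughly like this (a sketch only, not part of the patch; it reuses kvm_mmu_isolate_pages(), free_pages_rcu() and the sp->rcu head from the hunk above, and assumes the rest of kvm_mmu_commit_zap_page() stays as in the current tree):

static void kvm_mmu_commit_zap_page(struct kvm *kvm,
				    struct list_head *invalid_list)
{
	struct kvm_mmu_page *sp;

	if (list_empty(invalid_list))
		return;

	kvm_flush_remote_tlbs(kvm);

	/*
	 * Unconditional RCU path: no atomic_read() of reader_counter,
	 * so the shared cache line is never touched here, but every
	 * zap now defers page reclaim to an RCU callback.
	 */
	kvm_mmu_isolate_pages(invalid_list);
	sp = list_first_entry(invalid_list, struct kvm_mmu_page, link);
	list_del_init(invalid_list);
	call_rcu(&sp->rcu, free_pages_rcu);
}

Every commit then waits out an RCU grace period before the pages can be reused, and with zapping being this frequent on soft mmu, that deferral is presumably what my test is seeing.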