Re: [PATCH] RISC-V: KVM: Avoid re-acquiring memslot in kvm_riscv_gstage_map()

From: Sean Christopherson
Date: Tue Jun 17 2025 - 10:39:57 EST


On Sun, Jun 15, 2025, Anup Patel wrote:
> On Sat, Jun 14, 2025 at 3:59 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Thu, Jun 12, 2025, Andrew Jones wrote:
> > > On Wed, Jun 11, 2025 at 09:17:36AM -0700, Sean Christopherson wrote:
> > > > Looks like y'all also have a bug where an -EEXIST will be returned to userspace,
> > > > and will generate what's probably a spurious kvm_err() message.
> > >
> > > On 32-bit riscv, due to losing the upper bits of the physical address? Or
> > > is there yet another thing to fix?
> >
> > Another bug, I think. gstage_set_pte() returns -EEXIST if a PTE exists, and I
> > _assume_ that's supposed to be benign? But this code returns it blindly:
>
> gstage_set_pte() returns -EEXIST only when it was expecting a non-leaf
> PTE at a particular level but got a leaf PTE

Right, but isn't returning -EEXIST all the way to userspace undesirable behavior?

E.g. in this sequence, KVM will return -EEXIST and incorrectly terminate the VM
(assuming the VMM doesn't miraculously recover somehow):

1. Back the VM with HugeTLBFS
2. Fault-in memory, i.e. create hugepage mappings
3. Enable KVM_MEM_LOG_DIRTY_PAGES
4. Write-protection fault, kvm_riscv_gstage_map() tries to create a writable
non-huge mapping.
5. gstage_set_pte() encounters the huge leaf PTE before reaching the target
level, and returns -EEXIST.

AFAICT, gstage_wp_memory_region() doesn't split/shatter/demote hugepages, it
simply clears _PAGE_WRITE.

It's entirely possible I'm missing something that makes the above scenario
impossible in practice, but at this point I'm genuinely curious :-)