Re: [RFC PATCH v2 04/51] KVM: guest_memfd: Introduce KVM_GMEM_CONVERT_SHARED/PRIVATE ioctls
From: Ackerley Tng
Date: Fri May 16 2025 - 13:53:58 EST
Ira Weiny <ira.weiny@xxxxxxxxx> writes:
> Ackerley Tng wrote:
>
> [snip]
>
>> Signed-off-by: Ackerley Tng <ackerleytng@xxxxxxxxxx>
>>
>
> [snip]
>
>> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
>> index 590932499eba..f802116290ce 100644
>> --- a/virt/kvm/guest_memfd.c
>> +++ b/virt/kvm/guest_memfd.c
>> @@ -30,6 +30,10 @@ enum shareability {
>> };
>>
>> static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index);
>> +static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
>> + pgoff_t end);
>> +static void kvm_gmem_invalidate_end(struct kvm_gmem *gmem, pgoff_t start,
>> + pgoff_t end);
>>
>> static struct kvm_gmem_inode_private *kvm_gmem_private(struct inode *inode)
>> {
>> @@ -85,6 +89,306 @@ static struct folio *kvm_gmem_get_shared_folio(struct inode *inode, pgoff_t inde
>> return kvm_gmem_get_folio(inode, index);
>> }
>>
>> +/**
>> + * kvm_gmem_shareability_store() - Sets shareability to @value for range.
>> + *
>> + * @mt: the shareability maple tree.
>> + * @index: the range begins at this index in the inode.
>> + * @nr_pages: number of PAGE_SIZE pages in this range.
>> + * @value: the shareability value to set for this range.
>> + *
>> + * Unlike mtree_store_range(), this function also merges adjacent ranges that
>> + * have the same values as an optimization.
>
> Is this an optimization or something which will be required to convert
> from shared back to private and back to a huge page mapping?
>
This is an optimization.

> If this is purely an optimization it might be best to leave it out for now
> to get functionality first.
>
I see this (small) optimization as part of using maple trees.

Fuad's version [1] uses xarrays, with one xarray entry per page offset.
I wanted to illustrate that with maple trees a single entry can cover a
whole range, and part of that sharing involves merging adjacent
shareability entries that have the same value.

IIUC, these other users of maple trees also do some kind of
expansion/range merging:

+ VMAs in vma_expand() [2]
+ regcache in regcache_maple_write() [3]
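
As a rough illustration of the merging described above, here is a
userspace toy. It is not the patch's code and does not use the maple
tree API: a small sorted array stands in for the tree, and every toy_*
name is made up for this sketch. It only shows the idea that storing a
value over a range collapses it with neighbouring ranges holding the
same value, so one entry ends up covering the whole span.

```c
#include <stddef.h>

#define TOY_MAX_RANGES 64	/* arbitrary capacity for the sketch */

struct toy_range {
	unsigned long first;	/* first page index, inclusive */
	unsigned long last;	/* last page index, inclusive */
	int value;		/* stand-in for a shareability value */
};

/* Sorted, non-overlapping ranges; a stand-in for the maple tree. */
struct toy_map {
	struct toy_range r[TOY_MAX_RANGES];
	size_t nr;
};

/* Append a range, extending the previous entry if it is contiguous
 * and carries the same value. Callers append in ascending order. */
static int toy_push(struct toy_map *m, unsigned long first,
		    unsigned long last, int value)
{
	struct toy_range *prev = m->nr ? &m->r[m->nr - 1] : NULL;

	if (prev && prev->value == value && prev->last + 1 == first) {
		prev->last = last;
		return 0;
	}
	if (m->nr == TOY_MAX_RANGES)
		return -1;
	m->r[m->nr++] = (struct toy_range){ first, last, value };
	return 0;
}

/* Store @value over [first, last], then leave equal neighbours merged.
 * On failure (capacity exceeded) the map is left unchanged. */
static int toy_store_merged(struct toy_map *m, unsigned long first,
			    unsigned long last, int value)
{
	struct toy_map out = { .nr = 0 };
	int placed = 0;
	size_t i;

	for (i = 0; i < m->nr; i++) {
		const struct toy_range *cur = &m->r[i];

		/* Keep whatever part of cur lies left of the new range. */
		if (cur->first < first) {
			unsigned long end =
				cur->last < first ? cur->last : first - 1;

			if (toy_push(&out, cur->first, end, cur->value))
				return -1;
		}
		/* Emit the new range once, at its sorted position. */
		if (!placed && cur->last >= first) {
			if (toy_push(&out, first, last, value))
				return -1;
			placed = 1;
		}
		/* Keep whatever part of cur lies right of the new range. */
		if (cur->last > last) {
			unsigned long start =
				cur->first > last ? cur->first : last + 1;

			if (toy_push(&out, start, cur->last, cur->value))
				return -1;
		}
	}
	if (!placed && toy_push(&out, first, last, value))
		return -1;

	*m = out;
	return 0;
}
```

With this toy, storing value 0 over pages 0-9, then 1 over pages 4-5,
leaves three entries; storing 0 over pages 4-5 again brings it back to
a single entry covering 0-9, where a per-page-offset structure would
hold ten.
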
> I have more to review but wanted to ask this.
>
> Ira
>
> [snip]

[1] https://lore.kernel.org/all/20250328153133.3504118-4-tabba@xxxxxxxxxx/
[2] https://elixir.bootlin.com/linux/v6.14.6/source/mm/vma.c#L1059
[3] https://elixir.bootlin.com/linux/v6.14.6/source/drivers/base/regmap/regcache-maple.c#L38