Re: [RFC PATCH 08/21] KVM: TDX: Increase/decrease folio ref for huge pages
From: Yan Zhao
Date: Mon Jun 16 2025 - 21:44:27 EST
On Tue, Jun 17, 2025 at 08:12:50AM +0800, Edgecombe, Rick P wrote:
> On Mon, 2025-06-16 at 17:59 +0800, Yan Zhao wrote:
> > If the above changes are agreeable, we could consider a more ambitious approach:
> > introducing an interface like:
> >
> > int guest_memfd_add_page_ref_count(gfn_t gfn, int nr);
> > int guest_memfd_dec_page_ref_count(gfn_t gfn, int nr);
>
> We talked about doing something like having tdx_hold_page_on_error() in
> guestmemfd with a proper name. The separation of concerns will be better if we
> can just tell guestmemfd, the page has an issue. Then guestmemfd can decide how
> to handle it (refcount or whatever).
Compared with tdx_hold_page_on_error(), the advantage of informing guest_memfd
that TDX is holding a page at 4KB granularity is that, even if there is a bug
in KVM (such as forgetting to notify TDX to remove a mapping in
handle_removed_pt()), guest_memfd would still know that the page remains mapped
in the TDX module. guest_memfd could then decide how to handle the problematic
page (through refcount adjustments or other methods) before truncating it.
> >
> > This would allow guest_memfd to maintain an internal reference count for each
> > private GFN. TDX would call guest_memfd_add_page_ref_count() for mapping and
> > guest_memfd_dec_page_ref_count() after a successful unmapping. Before truncating
> > a private page from the filemap, guest_memfd could increase the real folio
> > reference count based on its internal reference count for the private GFN.
>
> What does this get us exactly? This is the argument to have less error prone
> code that can survive forgetting to refcount on error? I don't see that it is an
> especially special case.
Yes, the goal is less error-prone code.
If this approach is considered too complex for an initial implementation, using
tdx_hold_page_on_error() is also a viable option.
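
To make the per-GFN counting idea above more concrete, here is a rough
userspace model of it. This is only a sketch, not kernel code. It assumes that
"nr" means the number of consecutive 4KB pages starting at "gfn", which the
proposal does not spell out. MODEL_NR_GFNS, gfn_refs[], folio_extra_refs and
model_truncate_range() are hypothetical names used only for illustration; the
last one stands in for whatever guest_memfd would do before removing pages
from the filemap.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical: size of the modeled private GFN range. */
#define MODEL_NR_GFNS 1024

/* Hypothetical: per-4KB-GFN count of mappings TDX still holds. */
static int gfn_refs[MODEL_NR_GFNS];
/* Hypothetical: stands in for extra references taken on the real folio. */
static int folio_extra_refs;

/* Return 0 if [gfn, gfn + nr) lies inside the modeled range. */
static int model_check_range(gfn_t gfn, int nr)
{
	if (nr <= 0 || gfn >= MODEL_NR_GFNS ||
	    (gfn_t)nr > MODEL_NR_GFNS - gfn)
		return -EINVAL;
	return 0;
}

/* Called by TDX when it maps nr 4KB pages starting at gfn. */
int guest_memfd_add_page_ref_count(gfn_t gfn, int nr)
{
	int i;

	if (model_check_range(gfn, nr))
		return -EINVAL;
	for (i = 0; i < nr; i++)
		gfn_refs[gfn + i]++;
	return 0;
}

/* Called by TDX only after a successful unmapping. */
int guest_memfd_dec_page_ref_count(gfn_t gfn, int nr)
{
	int i;

	if (model_check_range(gfn, nr))
		return -EINVAL;
	/* Validate first so that a failing call changes nothing. */
	for (i = 0; i < nr; i++)
		if (gfn_refs[gfn + i] == 0)
			return -EINVAL;
	for (i = 0; i < nr; i++)
		gfn_refs[gfn + i]--;
	return 0;
}

/*
 * Before truncation: count the pages TDX still holds and take that many
 * extra references, so the pages are kept around instead of being freed.
 * Returns the number of pages still held, or -EINVAL for a bad range.
 */
int model_truncate_range(gfn_t gfn, int nr)
{
	int i, held = 0;

	if (model_check_range(gfn, nr))
		return -EINVAL;
	for (i = 0; i < nr; i++)
		if (gfn_refs[gfn + i] > 0)
			held++;
	assert(held <= nr);
	folio_extra_refs += held;
	return held;
}
```

In this model, if TDX maps 512 pages but only reports 511 of them as unmapped
(for example, because of a missed removal), model_truncate_range() over the
same range finds one page still held and pins it. That is the case the
per-GFN count is meant to catch without relying on the error path to remember
the refcount.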