Re: [RFC PATCH 02/21] x86/virt/tdx: Enhance tdh_mem_page_aug() to support huge pages

From: Nikolay Borisov
Date: Thu Jun 19 2025 - 05:26:32 EST




On 5/16/25 12:05, Yan Zhao wrote:
On Wed, May 14, 2025 at 02:52:49AM +0800, Edgecombe, Rick P wrote:
On Thu, 2025-04-24 at 11:04 +0800, Yan Zhao wrote:
Enhance the SEAMCALL wrapper tdh_mem_page_aug() to support huge pages.

Verify the validity of the level and ensure that the mapping range is fully
contained within the page folio.

As a conservative solution, perform CLFLUSH on all pages to be mapped into
the TD before invoking the SEAMCALL TDH_MEM_PAGE_AUG. This ensures that any
dirty cache lines do not write back later and clobber TD memory.

This should have a brief background on why it doesn't use the arg - what is
deficient today. Also, an explanation of how it will be used (i.e. what types of
pages will be passed)
Will do.


Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
---
 arch/x86/virt/vmx/tdx/tdx.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index f5e2a937c1e7..a66d501b5677 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1595,9 +1595,18 @@ u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, int level, struct page *page, u
  .rdx = tdx_tdr_pa(td),
  .r8 = page_to_phys(page),
  };
+ unsigned long nr_pages = 1 << (level * 9);
+ struct folio *folio = page_folio(page);
+ unsigned long idx = 0;
  u64 ret;
- tdx_clflush_page(page);
+ if (!(level >= TDX_PS_4K && level < TDX_PS_NR) ||
+     (folio_page_idx(folio, page) + nr_pages > folio_nr_pages(folio)))
+ return -EINVAL;

Shouldn't KVM not try to map a huge page in this situation? Doesn't seem like a
job for the SEAMCALL wrapper.
Ok. If the decision is to trust KVM and all potential callers, it's reasonable
to drop those checks.
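
For reference, the arithmetic behind the check being discussed can be tried out in userspace. This is only a sketch: mapping_fits_folio() is a hypothetical name, the TDX_PS_* values are assumed to mirror the kernel's definitions, and the folio is reduced to a plain page count.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror the kernel's TDX page-size levels. */
enum { TDX_PS_4K = 0, TDX_PS_2M = 1, TDX_PS_1G = 2, TDX_PS_NR = 3 };

/*
 * Hypothetical helper: returns 0 and stores the number of 4K pages in
 * *nr_pages if [page_idx, page_idx + *nr_pages) fits in a folio of
 * folio_pages pages. Returns -1 if the level is out of range or the
 * range runs past the end of the folio.
 */
static int mapping_fits_folio(int level, unsigned long page_idx,
			      unsigned long folio_pages,
			      unsigned long *nr_pages)
{
	assert(nr_pages != NULL);

	if (level < TDX_PS_4K || level >= TDX_PS_NR)
		return -1;

	/* Each level covers 9 more address bits: 1, 512, 262144 pages. */
	*nr_pages = 1UL << (level * 9);
	if (page_idx + *nr_pages > folio_pages)
		return -1;

	return 0;
}
```

With a 512-page folio, a 2M mapping starting at index 0 fits, while the same mapping starting at index 1 does not.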

+
+ while (nr_pages--)
+ tdx_clflush_page(nth_page(page, idx++));

tdx_clflush_page() is:
static void tdx_clflush_page(struct page *page)
{
	clflush_cache_range(page_to_virt(page), PAGE_SIZE);
}

So we have loops within loops... Better to add an arg to tdx_clflush_page() or
add a variant that takes one.
Ok.

One thing to note is that even with an extra arg, tdx_clflush_page() has to call
clflush_cache_range() page by page because with
"#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)",
page virtual addresses are not necessarily contiguous.
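
A rough userspace sketch of what such a variant could look like, assuming nothing about the final kernel interface: mock_clflush_pages(), struct mock_page and mock_clflush_cache_range() are all hypothetical stand-ins, and the "flush" only counts calls and bytes. The point is that the helper takes a page count but still issues one range flush per page, since each page's virtual address is looked up separately.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL

/* Counters standing in for the real cache flush, for illustration only. */
static unsigned long flush_calls;
static unsigned long flushed_bytes;

/* Hypothetical stand-in for clflush_cache_range(). */
static void mock_clflush_cache_range(void *vaddr, unsigned long size)
{
	(void)vaddr;
	flush_calls++;
	flushed_bytes += size;
}

/* Hypothetical page descriptor: each page carries its own virtual address. */
struct mock_page {
	void *vaddr;
};

/*
 * Hypothetical variant taking a page count. Virtual addresses of
 * consecutive pages are not assumed to be contiguous, so every page
 * gets its own PAGE_SIZE flush instead of one large range flush.
 */
static void mock_clflush_pages(struct mock_page *pages, unsigned long nr_pages)
{
	assert(pages != NULL);

	for (unsigned long i = 0; i < nr_pages; i++)
		mock_clflush_cache_range(pages[i].vaddr, PAGE_SIZE);
}
```

For a 2M mapping this means 512 separate calls, each covering 4096 bytes.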

What about Binbin's proposal [1]? i.e.,

	while (nr_pages)
		tdx_clflush_page(nth_page(page, --nr_pages));

What's the problem with using:

+ for (int i = 0; nr_pages; nr_pages--)
+ 	tdx_clflush_page(nth_page(page, i++));


The kernel now allows C99-style declaration of variables inside a loop, and it's clear how many times the loop has to be executed.
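
To compare the three loop forms from this thread side by side, here is a small userspace sketch. The visit_log structure and the loop_* function names are made up for illustration, and visit() stands in for tdx_clflush_page(nth_page(page, i)). All three touch every page exactly once; only the order differs.

```c
#include <assert.h>
#include <stddef.h>

/* Records the order in which page indices are visited (illustration only). */
struct visit_log {
	unsigned long idx[8];
	size_t count;
};

static void visit(struct visit_log *log, unsigned long i)
{
	assert(log->count < 8);
	log->idx[log->count++] = i;
}

/* Loop as posted in the patch: nr_pages counts down, idx counts up. */
static void loop_patch(struct visit_log *log, unsigned long nr_pages)
{
	unsigned long idx = 0;

	while (nr_pages--)
		visit(log, idx++);
}

/* The descending form: no separate index, pages visited last to first. */
static void loop_descending(struct visit_log *log, unsigned long nr_pages)
{
	while (nr_pages)
		visit(log, --nr_pages);
}

/* The for-loop form with a loop-scoped counter. */
static void loop_for(struct visit_log *log, unsigned long nr_pages)
{
	for (int i = 0; nr_pages; nr_pages--)
		visit(log, i++);
}
```

With nr_pages = 4, loop_patch() and loop_for() visit indices 0, 1, 2, 3, while loop_descending() visits 3, 2, 1, 0. For a cache flush the order should not matter, so the choice is mostly about readability.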

[1] https://lore.kernel.org/all/a7d0988d-037c-454f-bc6b-57e71b357488@xxxxxxxxxxxxxxx/

+
  ret = seamcall_ret(TDH_MEM_PAGE_AUG, &args);
  *ext_err1 = args.rcx;