Re: [RFC PATCH 02/21] x86/virt/tdx: Enhance tdh_mem_page_aug() to support huge pages

From: Nikolay Borisov
Date: Thu Jun 19 2025 - 05:26:32 EST


On 5/16/25 12:05, Yan Zhao wrote:

On Wed, May 14, 2025 at 02:52:49AM +0800, Edgecombe, Rick P wrote:

On Thu, 2025-04-24 at 11:04 +0800, Yan Zhao wrote:

Enhance the SEAMCALL wrapper tdh_mem_page_aug() to support huge pages.

Verify the validity of the level and ensure that the mapping range is fully
contained within the page folio.

As a conservative solution, perform CLFLUSH on all pages to be mapped into
the TD before invoking the SEAMCALL TDH_MEM_PAGE_AUG. This ensures that any
dirty cache lines do not write back later and clobber TD memory.

This should have a brief background on why it doesn't use the arg, i.e. what is
deficient today. It should also explain how it will be used (i.e. what types of
pages will be passed).

Will do.

Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
---
arch/x86/virt/vmx/tdx/tdx.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index f5e2a937c1e7..a66d501b5677 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1595,9 +1595,18 @@ u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, int level, struct page *page, u
.rdx = tdx_tdr_pa(td),
.r8 = page_to_phys(page),
};
+ unsigned long nr_pages = 1 << (level * 9);
+ struct folio *folio = page_folio(page);
+ unsigned long idx = 0;
u64 ret;
- tdx_clflush_page(page);
+ if (!(level >= TDX_PS_4K && level < TDX_PS_NR) ||
+ (folio_page_idx(folio, page) + nr_pages > folio_nr_pages(folio)))
+ return -EINVAL;

Shouldn't KVM not try to map a huge page in this situation? Doesn't seem like a
job for the SEAMCALL wrapper.

Ok. If the decision is to trust KVM and all potential callers, it's reasonable
to drop those checks.
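For reference, the arithmetic behind the check under discussion can be modelled in plain userspace C. This is only a sketch: the enum values are stand-ins assumed to mirror the kernel's TDX_PS_* constants, and level_to_nr_pages() and mapping_fits_folio() are hypothetical helper names, not functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-ins for the kernel's TDX page-size levels. */
enum { TDX_PS_4K = 0, TDX_PS_2M = 1, TDX_PS_1G = 2, TDX_PS_NR = 3 };

/* Each level multiplies the number of 4K pages by 512 (2^9). */
static uint64_t level_to_nr_pages(int level)
{
	return UINT64_C(1) << (level * 9);
}

/*
 * Hypothetical model of the range check: a mapping of the given level
 * that starts at page index page_idx must lie entirely inside a folio
 * of folio_pages pages.
 */
static bool mapping_fits_folio(int level, uint64_t page_idx,
			       uint64_t folio_pages)
{
	if (level < TDX_PS_4K || level >= TDX_PS_NR)
		return false;
	return page_idx + level_to_nr_pages(level) <= folio_pages;
}
```

Under these assumptions a 2M mapping covers 512 pages, so it fits a 512-page folio only when it starts at index 0.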

+
+ while (nr_pages--)
+ tdx_clflush_page(nth_page(page, idx++));

tdx_clflush_page() is:
static void tdx_clflush_page(struct page *page)
{
clflush_cache_range(page_to_virt(page), PAGE_SIZE);
}

So we have loops within loops... Better to add an arg to tdx_clflush_page() or
add a variant that takes one.

Ok.

One thing to note is that even with an extra arg, tdx_clflush_page() has to call
clflush_cache_range() page by page because with
"#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)",
page virtual addresses are not necessarily contiguous.
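To illustrate the point, here is a userspace mock of a variant that takes a page count but still flushes one page at a time. It is a sketch only: tdx_clflush_pages() is a hypothetical name, struct page is reduced to a single field standing in for page_to_virt(), page[i] stands in for nth_page(page, i), and clflush_cache_range() is replaced by a stub that just records its arguments.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL
#define MAX_FLUSHES 16

/* Mock struct page: only carries the page's virtual address. */
struct page { void *virt; };

/* Log of the (address, size) pairs passed to the mock flush. */
static struct { void *addr; size_t size; } flush_log[MAX_FLUSHES];
static size_t flush_count;

/* Stub for clflush_cache_range(): records the request, flushes nothing. */
static void clflush_cache_range(void *addr, size_t size)
{
	if (flush_count < MAX_FLUSHES) {
		flush_log[flush_count].addr = addr;
		flush_log[flush_count].size = size;
		flush_count++;
	}
}

/*
 * Hypothetical count-taking variant. It issues one PAGE_SIZE flush per
 * page rather than a single nr_pages * PAGE_SIZE flush, because the
 * pages' virtual addresses are not assumed to be contiguous.
 */
static void tdx_clflush_pages(struct page *page, unsigned long nr_pages)
{
	for (unsigned long i = 0; i < nr_pages; i++)
		clflush_cache_range(page[i].virt, PAGE_SIZE);
}
```

The extra argument moves the loop out of the caller, but it does not reduce the number of clflush_cache_range() calls.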

What about Binbin's proposal [1]? i.e.,

while (nr_pages)
tdx_clflush_page(nth_page(page, --nr_pages));

What's the problem with using:

+ for (int i = 0; nr_pages; nr_pages--)
+ tdx_clflush_page(nth_page(page, i++));

The kernel now allows C99-style variable declarations inside a for loop, and this form makes it clear how many times the loop has to execute.
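To compare the three loop shapes discussed in this thread, here is a small userspace sketch that records which page indices each one visits. The visit_*() helpers are hypothetical and exist only for this comparison; the caller is assumed to pass an output array with room for nr_pages entries.

```c
#include <assert.h>
#include <stddef.h>

/* Form from the original patch: ascending, separate index variable. */
static size_t visit_original(unsigned long nr_pages, unsigned long *out)
{
	unsigned long idx = 0;
	size_t n = 0;

	while (nr_pages--)
		out[n++] = idx++;
	return n;
}

/* Form quoted as Binbin's proposal: descending, no extra index. */
static size_t visit_descending(unsigned long nr_pages, unsigned long *out)
{
	size_t n = 0;

	while (nr_pages)
		out[n++] = --nr_pages;
	return n;
}

/* for-loop form with a loop-scoped index: ascending. */
static size_t visit_for(unsigned long nr_pages, unsigned long *out)
{
	size_t n = 0;

	for (int i = 0; nr_pages; nr_pages--)
		out[n++] = i++;
	return n;
}
```

All three visit each index in [0, nr_pages) exactly once. Only the descending form walks the pages in reverse order, which should not matter for a cache flush.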

[1] https://lore.kernel.org/all/a7d0988d-037c-454f-bc6b-57e71b357488@xxxxxxxxxxxxxxx/

+
ret = seamcall_ret(TDH_MEM_PAGE_AUG, &args);
*ext_err1 = args.rcx;
