Re: [PATCH v5 12/22] x86/virt/tdx: Convert all memory regions in memblock to TDX memory

From: Juergen Gross
Date: Thu Jul 07 2022 - 10:36:28 EST


On 07.07.22 16:26, Dave Hansen wrote:
On 6/26/22 23:16, Kai Huang wrote:
On Fri, 2022-06-24 at 12:40 -0700, Dave Hansen wrote:
+/*
+ * Walks over all memblock memory regions that are intended to be
+ * converted to TDX memory. Essentially, it is all memblock memory
+ * regions excluding the low memory below 1MB.
+ *
+ * This is because on some TDX platforms the low memory below 1MB is
+ * not included in CMRs. Excluding the low 1MB can still guarantee
+ * that the pages managed by the page allocator are always TDX memory,
+ * as the low 1MB is reserved during kernel boot and won't end up to
+ * the ZONE_DMA (see reserve_real_mode()).
+ */
+#define memblock_for_each_tdx_mem_pfn_range(i, p_start, p_end, p_nid) \
+ for_each_mem_pfn_range(i, MAX_NUMNODES, p_start, p_end, p_nid) \
+ if (!pfn_range_skip_lowmem(p_start, p_end))

Let's summarize where we are at this point:

1. All RAM is described in memblocks
2. Some memblocks are reserved and some are free
3. The lower 1MB is marked reserved
4. for_each_mem_pfn_range() walks all reserved and free memblocks, so we
have to exclude the lower 1MB as a special case.

That seems superficially rather ridiculous. Shouldn't we just pick a
memblock iterator that skips the 1MB? Surely there is such a thing.

Perhaps you are suggesting we should always loop the _free_ ranges so we don't
need to care about the first 1MB which is reserved?

The problem is some reserved memory regions are actually later freed to the page
allocator, for example, initrd. So to cover all those 'late-freed-reserved-
regions', I used for_each_mem_pfn_range(), instead of for_each_free_mem_range().

Why not just entirely remove the lower 1MB from the memblock structure
on TDX systems? Do something equivalent to adding this on the kernel
command line:

memmap=1M$0x0

Btw, I do have a checkpatch warning around this code:

ERROR: Macros with complex values should be enclosed in parentheses
#109: FILE: arch/x86/virt/vmx/tdx/tdx.c:377:
+#define memblock_for_each_tdx_mem_pfn_range(i, p_start, p_end, p_nid) \
+ for_each_mem_pfn_range(i, MAX_NUMNODES, p_start, p_end, p_nid) \
+ if (!pfn_range_skip_lowmem(p_start, p_end))

But it looks like a false positive to me.

I think it doesn't like the if().

I think it is right.

Consider:

if (a)
memblock_for_each_tdx_mem_pfn_range(...)
func();
else
other_func();


Juergen

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature
Description: OpenPGP digital signature