Re: [PATCH v3 00/21] TDX host kernel support

From: Dan Williams
Date: Thu May 05 2022 - 20:22:29 EST


On Thu, May 5, 2022 at 3:14 PM Kai Huang <kai.huang@xxxxxxxxx> wrote:
>
> Thanks for feedback!
>
> On Thu, 2022-05-05 at 06:51 -0700, Dan Williams wrote:
> > [ add Mike ]
> >
> >
> > On Thu, May 5, 2022 at 2:54 AM Kai Huang <kai.huang@xxxxxxxxx> wrote:
> > [..]
> > >
> > > Hi Dave,
> > >
> > > Sorry to ping (trying to close this).
> > >
> > > Given we don't need to consider kmem-hot-add legacy PMEM after TDX module
> > > initialization, I think for now it's totally fine to exclude legacy PMEMs from
> > > TDMRs. The worst case is when someone tries to use them as TD guest backend
> > > directly, the TD will fail to create. IMO it's acceptable, as it is supposedly
> > > that no one should just use some random backend to run TD.
> >
> > The platform will already do this, right?
> >
>
> In the current v3 implementation, we don't have any code to handle memory
> hotplug, therefore nothing prevents people from adding legacy PMEMs as system
> RAM using kmem driver. In order to guarantee all pages managed by page

That's the fundamental question I am asking why is "guarantee all
pages managed by page allocator are TDX memory". That seems overkill
compared to indicating the incompatibility after the fact.

> allocator are all TDX memory, the v3 implementation needs to always include
> legacy PMEMs as TDX memory so that even people truly add legacy PMEMs as system
> RAM, we can still guarantee all pages in page allocator are TDX memory.

Why?

> Of course, a side benefit of always including legacy PMEMs is people
> theoretically can use them directly as TD guest backend, but this is just a
> bonus but not something that we need to guarantee.
>
>
> > I don't understand why this
> > is trying to take proactive action versus documenting the error
> > conditions and steps someone needs to take to avoid unconvertible
> > memory. There is already the CONFIG_HMEM_REPORTING that describes
> > relative performance properties between initiators and targets, it
> > seems fitting to also add security properties between initiators and
> > targets so someone can enumerate the numa-mempolicy that avoids
> > unconvertible memory.
>
> I don't think there's anything related to performance properties here. The only
> goal here is to make sure all pages in page allocator are TDX memory pages.

Please reconsider or re-clarify that goal.

>
> >
> > No, special casing in hotplug code paths needed.
> >
> > >
> > > I think w/o needing to include legacy PMEM, it's better to get all TDX memory
> > > blocks based on memblock, but not e820. The pages managed by page allocator are
> > > from memblock anyway (w/o those from memory hotplug).
> > >
> > > And I also think it makes more sense to introduce 'tdx_memblock' and
> > > 'tdx_memory' data structures to gather all TDX memory blocks during boot when
> > > memblock is still alive. When TDX module is initialized during runtime, TDMRs
> > > can be created based on the 'struct tdx_memory' which contains all TDX memory
> > > blocks we gathered based on memblock during boot. This is also more flexible to
> > > support other TDX memory from other sources such as CLX memory in the future.
> > >
> > > Please let me know if you have any objection? Thanks!
> >
> > It's already the case that x86 maintains sideband structures to
> > preserve memory after exiting the early memblock code.
> >
>
> May I ask what data structures are you referring to?

struct numa_meminfo.

> Btw, the purpose of 'tdx_memblock' and 'tdx_memory' is not only just to preserve
> memblock info during boot. It is also used to provide a common data structure
> that the "constructing TDMRs" code can work on. If you look at patch 11-14, the
> logic (create TDMRs, allocate PAMTs, sets up reserved areas) around how to
> construct TDMRs doesn't have hard dependency on e820. If we construct TDMRs
> based on a common 'tdx_memory' like below:
>
> int construct_tdmrs(struct tdx_memory *tmem, ...);
>
> It would be much easier to support other TDX memory resources in the future.

"in the future" is a prompt to ask "Why not wait until that future /
need arrives before adding new infrastructure?"

> The thing I am not sure is Dave wants to keep the code minimal (as this series
> is already very big in terms of LoC) to make TDX running, and for now in
> practice there's only system RAM during boot is TDX capable, so I am not sure we
> should introduce those structures now.
>
> > Mike, correct
> > me if I am wrong, but adding more is less desirable than just keeping
> > the memblock around?