Re: [PATCH v8 07/16] x86/virt/tdx: Use all system memory when initializing TDX module as TDX memory

From: Huang, Kai
Date: Thu Jan 12 2023 - 06:43:13 EST


On Mon, 2023-01-09 at 08:51 -0800, Dave Hansen wrote:
> > > > > > +       list_for_each_entry(tmb, &tdx_memlist, list) {
> > > > > > +               /*
> > > > > > +                * The new range is TDX memory if it is fully
> > > > > > covered by
> > > > > > +                * any TDX memory block.
> > > > > > +                *
> > > > > > +                * Note TDX memory blocks are originated from
> > > > > > memblock
> > > > > > +                * memory regions, which can only be contiguous when
> > > > > > two
> > > > > > +                * regions have different NUMA nodes or flags. 
> > > > > > Therefore
> > > > > > +                * the new range cannot cross multiple TDX memory
> > > > > > blocks.
> > > > > > +                */
> > > > > > +               if (start_pfn >= tmb->start_pfn && end_pfn <= tmb-
> > > > > > >end_pfn)
> > > > > > +                       return true;
> > > > > > +       }
> > > > > > +       return false;
> > > > > > +}
> > > >
> > > > I don't really like that comment.  It should first state its behavior
> > > > and assumptions, like:
> > > >
> > > >      This check assumes that the start_pfn<->end_pfn range does not
> > > >      cross multiple tdx_memlist entries.
> > > >
> > > > Only then should it describe why that is OK:
> > > >
> > > >      A single memory hotplug even across mutliple memblocks (from
> > > >      which tdx_memlist entries are derived) is impossible.  ... then
> > > >      actually explain
> > > >
> >
> > How about below?
> >
> >          /*
> >           * This check assumes that the start_pfn<->end_pfn range does not
> > cross
> >           * multiple tdx_memlist entries. A single memory hotplug event
> > across
> >           * multiple memblocks (from which tdx_memlist entries are derived)
> > is
> >           * impossible. That means start_pfn<->end_pfn range cannot exceed a
> >           * tdx_memlist entry, and the new range is TDX memory if it is
> > fully
> >           * covered by any tdx_memlist entry.
> >           */
>
> I was hoping you would actually explain why it is impossible.
>
> Is there something fundamental that keeps a memory area that spans two
> nodes from being removed and then a new area added that is comprised of
> a single node?
>
> Boot time:
>
> | memblock  |  memblock |
> <--Node=0--> <--Node=1-->
>
> Funky hotplug... nothing to see here, then:
>
> <--------Node=2-------->
>
> I would believe that there is no current bare-metal TDX system that has
> an implementation like this.  But, the comments above speak like it's
> fundamentally impossible.  That should be clarified.
>
> In other words, that comment talks about memblock attributes as being
> the core underlying reason that that simplified check is OK.  Is that
> it, or is it really the reduced hotplug feature set on TDX systems?

Hi Dave,

I think I have been forgetting that we have switched to reject non-TDX memory in
memory online, but not in memory hot-add.  

Memory offline/online is done on granularity of 'struct memory_block', but not
memblock. In fact, the hotpluggable memory region (one memblock) must be
multiple of memory_block, and a "to-be-online" memory_block must be full range
memory (no memory hole).

So if I am not missing something, IIUC that means if the start_pfn<->end_pfn is
TDX memory, it must be fully within some @tdx_memlist entry, but cannot cross
multiple small entries. And the memory hotplug case in your above diagram
actually shouldn't matter.

If above stands, how about below?

/*
* This check assumes that the start_pfn<->end_pfn range does not
* cross multiple @tdx_memlist entries. A single memory online
* event across multiple @tdx_memlist entries (which are derived
* from memblocks at the time of module initialization) is not
* possible.
*
* This is because memory offline/online is done on granularity
* of 'struct memory_block', and the hotpluggable memory region
* (one memblock) must be multiple of memory_block. Also, the
* "to-be-online" memory_block must be full memory (no memory
* hole, i.e. containing multiple small memblocks).
*
* This means if the start_pfn<->end_pfn range is TDX memory, it
* must be fully within one @tdx_memlist entry, but cannot cross
* multiple small entries.
*/
list_for_each_entry(tmb, &tdx_memlist, list) {
if (start_pfn >= tmb->start_pfn && end_pfn <= tmb->end_pfn)
return true;
}