Re: [PATCHv4 6/8] x86/mm: Provide helpers for unaccepted memory

From: Kirill A. Shutemov
Date: Wed Apr 13 2022 - 12:07:14 EST


On Fri, Apr 08, 2022 at 12:21:19PM -0700, Dave Hansen wrote:
> On 4/5/22 16:43, Kirill A. Shutemov wrote:
> > +void accept_memory(phys_addr_t start, phys_addr_t end)
> > +{
> > + unsigned long *unaccepted_memory;
> > + unsigned long flags;
> > + unsigned int rs, re;
> > +
> > + if (!boot_params.unaccepted_memory)
> > + return;
> > +
> > + unaccepted_memory = __va(boot_params.unaccepted_memory);
> > + rs = start / PMD_SIZE;
> > +
> > + spin_lock_irqsave(&unaccepted_memory_lock, flags);
> > + for_each_set_bitrange_from(rs, re, unaccepted_memory,
> > + DIV_ROUND_UP(end, PMD_SIZE)) {
> > + /* Platform-specific memory-acceptance call goes here */
> > + panic("Cannot accept memory");
> > + bitmap_clear(unaccepted_memory, rs, re - rs);
> > + }
> > + spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
> > +}
>
> Just to reiterate: this is a global spinlock. It's disabling
> interrupts. "Platform-specific memory-acceptance call" will soon become:
>
> tdx_accept_memory(rs * PMD_SIZE, re * PMD_SIZE);
>
> which is a page-by-page __tdx_module_call():
>
> > + for (i = 0; i < (end - start) / PAGE_SIZE; i++) {
> > + if (__tdx_module_call(TDACCEPTPAGE, start + i * PAGE_SIZE,
> > + 0, 0, 0, NULL)) {
> > + error("Cannot accept memory: page accept failed\n");
> > + }
> > + }
>
> Each __tdx_module_call() involves a privilege transition that also
> surely includes things like changing CR3. It can't be cheap. It also
> is presumably touching the memory and probably flushing it out of the
> CPU caches. It's also unbounded:
>
> spin_lock_irqsave(&unaccepted_memory_lock, flags);
> for (i = 0; i < (end - start) / PAGE_SIZE; i++)
> // thousands? tens-of-thousands of cycles??
> spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
>
> How far apart can end and start be? It's at *least* 2MB in the page
> allocator, which is on the order of a millisecond. Are we sure there
> aren't any callers that want to do this at a gigabyte granularity? That
> would hold the global lock and disable interrupts on the order of a second.

This code path is only invoked with orders < MAX_ORDER, which is at most
4M on x86-64.

> Do we want to bound the time that the lock can be held? Or, should we
> just let the lockup detectors tell us that we're being naughty?

The host can always DoS the guest, so yes, this can lead to lockups.


--
Kirill A. Shutemov