Re: [PATCH 2/4] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs
From: Guorui Yu
Date: Mon Jan 30 2023 - 08:45:31 EST
Hi Andi,
On 2023/1/30 14:46, Andi Kleen wrote:
>> I try to solve this problem by creating a new kernel thread, "kccd",
>> to populate the TLB buffer in the background.
>> Specifically,
>> 1. A new kernel thread is created with the help of "arch_initcall",
>> and this kthread is responsible for memory allocation and setting
>> memory attributes (private or shared);
>> 2. The "swiotlb_tbl_map_single" routine only uses the spin_lock
>> protected TLB buffers pre-allocated by the kthread;
>> a) which actually includes ONE memory allocation brought by xarray
>> insertion "__xa_insert__".
> That already seems dangerous with all the usual problems of memory
> allocations in IO paths. Normally code at least uses a mempool to avoid
> the worst deadlock potential.
The __xa_insert__ is called with GFP_NOWAIT (GFP_ATOMIC & ~__GFP_HIGH),
and I will dig into this to check whether there is any chance of a
deadlock.

I also tried my best to test this piece of code, and no issues have been
found at up to 700,000 IOPS.

Thanks for your advice on this point; I had not noticed this
possibility.
>> 3. After each allocation, the water level of TLB resources will be
>> checked. If the current TLB resources are found to be lower than the
>> preset value (half of the watermark), the kthread will be awakened to
>> fill them.
>> 4. The TLB buffer allocation in the kthread is batched to
>> "(MAX_ORDER_NR_PAGES << PAGE_SHIFT)" to reduce the holding time of
>> spin_lock and number of calls to set_memory_decrypted().
> Okay, but does this guarantee that it will never run out of memory?
> It seems difficult to make such guarantees. What happens for example if
> the background thread gets starved by something higher priority?
No, this cannot guarantee that we always have sufficient TLB caches, so
we may still hit a "No memory for cc-swiotlb buffer" warning.

But I want to emphasize that in this case, the current implementation is
no worse than the legacy implementation. Moreover, dynamic TLB
allocation is better suited to situations where more disks/network
devices will be hotplugged, in which case you cannot pre-set a
reasonable value.
> Or if the allocators have such high bandwidth that they can overwhelm
> any reasonable background thread.
>
> -Andi
Sincerely,
Guorui