Re: [PATCH 2/4] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs
From: Guorui Yu
Date: Tue Jan 31 2023 - 21:08:59 EST
On 2023/2/1 01:16, Andi Kleen wrote:
>> No, this cannot guarantee we always have sufficient TLB caches, so we
>> can also have a "No memory for cc-swiotlb buffer" warning.
>
> It's not just a warning, it will be IO errors, right?

Yes, they are IO errors, but such transient IO errors have not been fatal
in my limited testing so far, and the system survives them. Again, legacy
swiotlb also occasionally suffers from TLB starvation.
However, if dynamic allocation of TLBs is not allowed at all, the system
is more likely to be overwhelmed by a large burst of IOs and become
unable to respond. Such problems are generally transient, so they are
difficult to reproduce and debug in a production environment. Users can
only set an unreasonably large fixed size and REBOOT to mitigate this
problem as much as possible.
But I want to emphasize that in this case, the current implementation
is no worse than the legacy implementation. Moreover, dynamic TLB
allocation is more suitable for situations where more disks/network
devices will be hotplugged, in which case you cannot pre-set a
reasonable value.

> That's a reasonable standpoint, but you have to emphasize that it is
> "probabilistic" in all the descriptions and comments.

Agreed, but one point to add is that the user can adjust the water level
setting to reduce the likelihood of TLB allocation failures in interrupt
context.
According to the current design, the kthread is woken up to allocate
new TLBs when the amount of free TLBs drops below half of the water
level, so raising the water level leaves more headroom.
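
To make the water-level behaviour above concrete, here is a minimal
user-space sketch in C. It is not the code from the patch: the struct,
the function names (pool_take, pool_refill) and the "refill up to the
water level" policy are all assumptions for illustration. Only the
"wake the allocator when free TLBs fall below half of the water level"
rule comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool state; names are illustrative, not from the patch. */
struct tlb_pool {
	size_t free_slots;	/* TLB slots currently available */
	size_t watermark;	/* user-tunable water level */
	bool refill_pending;	/* stands in for "kthread has been woken" */
};

/* Fast path (may run in interrupt context): take one slot, never sleep. */
static bool pool_take(struct tlb_pool *p)
{
	assert(p != NULL);
	if (p->free_slots == 0)
		return false;	/* caller would report "No memory for cc-swiotlb buffer" */
	p->free_slots--;
	/* Below half of the water level: ask the background thread for more. */
	if (p->free_slots < p->watermark / 2)
		p->refill_pending = true;
	return true;
}

/*
 * Slow path (process context, may sleep and allocate).
 * Assumption: the pool is simply topped back up to the water level.
 */
static void pool_refill(struct tlb_pool *p)
{
	assert(p != NULL);
	if (!p->refill_pending)
		return;
	if (p->free_slots < p->watermark)
		p->free_slots = p->watermark;
	p->refill_pending = false;
}
```

With a water level of 8, the refill request is raised once the fifth
slot is taken (3 free < 4). With a water level of 64 it is raised while
31 slots are still free, which is the extra headroom for bursts arriving
in interrupt context before the kthread has had a chance to run.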

> I assume you did some stress testing (e.g. all cores submitting at full
> bandwidth) to validate that it works for you?
>
> -Andi

Yes, I tested with fio using different block sizes, iodepths and job
counts on my testbed.
I did notice some `No memory for cc-swiotlb buffer` "IO errors" at the
beginning of the test, but they eventually disappear as long as there is
enough free memory.
Thanks for your time,
Guorui