Re: [PATCH 2/4] swiotlb: Add a new cc-swiotlb implementation for Confidential VMs

From: Guorui Yu
Date: Tue Jan 31 2023 - 21:08:59 EST




On 2023/2/1 01:16, Andi Kleen wrote:
>> No, this cannot guarantee we always have sufficient TLB caches, so we can also have a "No memory for cc-swiotlb buffer" warning.
>
> It's not just a warning, it will be IO errors, right?


Yes, they are IO errors, but such transient IO errors have not been fatal in my limited testing so far, and the system survives them. Again, the legacy swiotlb also occasionally suffers from TLB starvation.

However, if dynamic allocation of TLBs is not allowed at all, the system is more likely to be overwhelmed by a large burst of IOs and become unresponsive. Such problems are generally transient, so they are difficult to reproduce and debug in a production environment. Users can only set an unreasonably large fixed size and REBOOT to mitigate the problem as much as possible.


But I want to emphasize that in this case, the current implementation is no worse than the legacy implementation. Moreover, dynamic TLB allocation is more suitable for situations where more disks/network devices will be hotplugged, in which case you cannot pre-set a reasonable value.

> That's a reasonable standpoint, but you have to emphasize that it is "probabilistic" in all the descriptions and comments.


Agreed, but one point to add is that the user can adjust the water level setting to reduce the likelihood of TLB allocation failures in interrupt context.

According to the current design, the kthread is woken up to allocate new TLBs when the number of free TLBs drops below half of the water level, so increasing the water level leaves more headroom.
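
To make the water-level behaviour concrete, here is a minimal userspace sketch in C. It is not the code from the patch: the struct, the function names, and the assumption that a refill tops the pool back up to the water level are all hypothetical, chosen only to illustrate the "wake the allocator below half of the water level" rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool state; names are illustrative, not from the patch. */
struct tlb_pool {
	size_t free_tlbs;     /* bounce buffers currently available */
	size_t watermark;     /* user-tunable water level */
	bool refill_pending;  /* stands in for "the kthread has been woken" */
};

/*
 * Take one buffer without blocking, as a caller in interrupt context
 * would have to. Returning false corresponds to the
 * "No memory for cc-swiotlb buffer" case, i.e. an IO error.
 */
static bool pool_take(struct tlb_pool *p)
{
	bool ok = p->free_tlbs > 0;

	if (ok)
		p->free_tlbs--;

	/* Request a refill once the free count is below half the water level. */
	if (p->free_tlbs < p->watermark / 2)
		p->refill_pending = true;

	return ok;
}

/*
 * Stands in for the kthread, which may sleep and allocate memory.
 * Assumption: it tops the pool back up to the water level.
 */
static void pool_refill(struct tlb_pool *p)
{
	assert(p->watermark > 0);

	if (p->free_tlbs < p->watermark)
		p->free_tlbs = p->watermark;
	p->refill_pending = false;
}
```

In this sketch, a water level of 8 raises the refill request only when the free count falls to 3, while a water level of 64 raises it at 31, so a burst has many more buffers to consume before the refill has to complete.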

> I assume you did some stress testing (e.g. all cores submitting at full bandwidth) to validate that it works for you?
>
> -Andi


Yes, I tested with fio using different block sizes, iodepths, and job counts on my testbed.

I did notice some "IO errors" of `No memory for cc-swiotlb buffer` at the beginning of the test, but they eventually disappear as long as there is enough free memory.

Thanks for your time,
Guorui