The existing IO page fault handler locates the PCI device by calling
pci_get_domain_bus_and_slot(), which walks the list of all PCI devices
until it finds a match. The search is O(n) in the number of PCI
devices, and it sits on the critical path of IO page fault handling,
which is performance sensitive. Under a heavily parallel dsa_test
workload, the slow lookup combined with lock contention on this path
may leave a CPU stuck.

To improve the performance of the IO page fault handler, replace
pci_get_domain_bus_and_slot() with a local red-black tree. A red-black
tree is a self-balancing binary search tree, so a lookup takes
O(log(n)) time even in the worst case, compared with O(n) for the
list walk.

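As a rough userspace illustration of the keyed lookup (not the code in
this patch), the sketch below packs domain/bus/devfn into one sortable
key and walks a binary search tree. The names iopf_dev_node, iopf_key(),
iopf_tree_insert() and iopf_tree_find() are hypothetical, the 16-bit
domain is an assumption, and the rebalancing that the kernel's rbtree
performs on insert is omitted, so only the search pattern is shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node type; the structures used by the patch may differ. */
struct iopf_dev_node {
	uint32_t key;                        /* packed domain/bus/devfn */
	struct iopf_dev_node *left, *right;
};

/* Assumption: a 16-bit domain, 8-bit bus and 8-bit devfn fit one key. */
static uint32_t iopf_key(uint16_t domain, uint8_t bus, uint8_t devfn)
{
	return ((uint32_t)domain << 16) | ((uint32_t)bus << 8) | devfn;
}

/*
 * Link a caller-allocated node into the tree.
 * Returns 0 on success, -1 if the key is already present.
 * A real red-black tree would also recolor/rotate here.
 */
static int iopf_tree_insert(struct iopf_dev_node **root,
			    struct iopf_dev_node *node)
{
	struct iopf_dev_node **link = root;

	while (*link) {
		if (node->key == (*link)->key)
			return -1;
		link = node->key < (*link)->key ? &(*link)->left
						: &(*link)->right;
	}
	node->left = node->right = NULL;
	*link = node;
	return 0;
}

/* Descend from the root, choosing one subtree per comparison. */
static struct iopf_dev_node *iopf_tree_find(struct iopf_dev_node *root,
					    uint32_t key)
{
	while (root) {
		if (key == root->key)
			return root;
		root = key < root->key ? root->left : root->right;
	}
	return NULL;
}
```

With balancing in place, the loop in iopf_tree_find() visits at most
O(log(n)) nodes, which is where the gain over the list walk comes from.
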
In addition, insert only the affected devices (those that have IO
page faults enabled) into the red-black tree. This keeps the tree
small and further reduces the lookup cost.

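The filtering idea can be sketched in userspace as follows. This is an
illustration under stated assumptions, not the patch itself: a sorted
array searched with bsearch() stands in for the red-black tree, and
dev_desc, iopf_index, iopf_index_build() and iopf_index_contains() are
hypothetical names. Only devices flagged as IOPF-enabled are indexed,
so devices that can never fault are not part of the search at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_IOPF_DEVS 16	/* arbitrary capacity for this sketch */

struct dev_desc {		/* hypothetical device descriptor */
	uint32_t id;		/* opaque device ID used as the key */
	bool iopf_enabled;	/* device has IO page faults enabled */
};

struct iopf_index {
	uint32_t ids[MAX_IOPF_DEVS];	/* sorted IDs of indexed devices */
	size_t count;
};

/* Three-way comparison of two uint32_t keys for qsort()/bsearch(). */
static int cmp_id(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Index only IOPF-enabled devices; returns how many were indexed. */
static size_t iopf_index_build(struct iopf_index *idx,
			       const struct dev_desc *devs, size_t n)
{
	idx->count = 0;
	for (size_t i = 0; i < n && idx->count < MAX_IOPF_DEVS; i++)
		if (devs[i].iopf_enabled)
			idx->ids[idx->count++] = devs[i].id;
	qsort(idx->ids, idx->count, sizeof(idx->ids[0]), cmp_id);
	return idx->count;
}

/* Binary search over the indexed devices only. */
static bool iopf_index_contains(const struct iopf_index *idx, uint32_t id)
{
	return bsearch(&id, idx->ids, idx->count, sizeof(idx->ids[0]),
		       cmp_id) != NULL;
}
```
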
This series depends on the "deliver page faults to user space" patch set:
https://lore.kernel.org/linux-iommu/20230928042734.16134-1-baolu.lu@xxxxxxxxxxxxxxx/

Signed-off-by: Huang Jiaqing <jiaqing.huang@xxxxxxxxx>
---
drivers/iommu/io-pgfault.c | 104 ++++++++++++++++++++++++++++++++++++-
include/linux/iommu.h | 16 ++++++
2 files changed, 118 insertions(+), 2 deletions(-)