On Thu, 2025-04-24 at 13:58 +0100, Robin Murphy wrote:
On 24/04/2025 6:12 am, Li, Hua Qian wrote:
On Tue, 2025-04-22 at 15:36 +0200, Marek Szyprowski wrote:
On 22.04.2025 08:37, huaqian.li@xxxxxxxxxxx wrote:
Thank you for your feedback, Marek.
From: Li Hua Qian <huaqian.li@xxxxxxxxxxx>
This patchset introduces a change to make the IO_TLB_SEGSIZE parameter
configurable via a new kernel configuration option
(CONFIG_SWIOTLB_SEGSIZE).
In certain applications, the default value of IO_TLB_SEGSIZE (128) may
not be sufficient for memory allocation, leading to runtime errors. By
making this parameter configurable, users can adjust the segment size to
better suit their specific use cases, improving flexibility and system
stability.
Could you elaborate a bit more on which applications require increasing
IO_TLB_SEGSIZE? I'm not against it, but such a change should be well
justified and described, while the above cover letter doesn't provide
anything more than is written in the patch description.
To provide more context, one specific application that requires
increasing IO_TLB_SEGSIZE is the Hailo 8 PCIe AI card. This card uses
dma_alloc_coherent to allocate descriptor lists, as seen in the Hailo
driver implementation here:
https://github.com/hailo-ai/hailort-drivers/blob/7161f9ee5918029bd4497f590003c2f87ec32507/linux/vdma/memory.c#L322
The maximum size (nslots) for these allocations can reach 160, which
exceeds the current default value of IO_TLB_SEGSIZE (128).
Since IO_TLB_SEGSIZE is defined as a constant in the kernel:
`#define IO_TLB_SEGSIZE 128`
this limitation causes swiotlb_search_pool_area
(https://github.com/torvalds/linux/blame/v6.15-rc2/kernel/dma/swiotlb.c#L1085),
or swiotlb_do_find_slots in older kernels, to fail when attempting to
allocate contiguous physical memory (CMA). This results in runtime
errors and prevents the Hailo 8 card from functioning correctly in
certain configurations.
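To make the arithmetic concrete, here is a minimal userspace sketch of the slot calculation. The three constants match include/linux/swiotlb.h in v6.15-rc2, and nr_slots() mirrors the kernel helper of the same name; fits_in_segment() is a hypothetical helper added only for illustration, and the 320 KiB figure assumes the 160 quoted above counts 2 KiB slots.

```c
#include <stddef.h>

/* Values as defined in include/linux/swiotlb.h (v6.15-rc2). */
#define IO_TLB_SHIFT   11                      /* one slot covers 2 KiB */
#define IO_TLB_SIZE    (1UL << IO_TLB_SHIFT)
#define IO_TLB_SEGSIZE 128                     /* max contiguous slots */

/* Number of slots needed to cover `bytes` bytes, rounded up. */
static inline unsigned long nr_slots(size_t bytes)
{
	return (bytes + IO_TLB_SIZE - 1) >> IO_TLB_SHIFT;
}

/*
 * Hypothetical helper (not a kernel function): can a single request of
 * `bytes` bytes be satisfied from one contiguous run of slots?
 */
static inline int fits_in_segment(size_t bytes)
{
	return nr_slots(bytes) <= IO_TLB_SEGSIZE;
}
```

With these values the largest contiguous request is 128 * 2 KiB = 256 KiB, so fits_in_segment(320 * 1024) is false: a 320 KiB request needs 160 slots.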
Hmm, dma_alloc_coherent() should really not be trying to allocate from
SWIOTLB in the first place - how is that happening?
If you're using restricted DMA for a device which wants significant
coherent allocations, then it wants to have its own shared-dma-pool for
those *as well* as the restricted-dma-pool for bouncing streaming DMA.
Thanks,
Robin.
Hi Robin,
Regarding the specific Hailo Card case, the issue arises due
to the capabilities of certain SoCs or CPUs. For example, many
K3 SoCs lack an IOMMU, which is typically used to isolate the
system against DMA-based attacks from external PCI devices.
Taking the TI AM65 as an example, it doesn't have an IOMMU, but
instead includes a Peripheral Virtualization Unit (PVU). The
PVU provides functionality similar to an IOMMU and is used to
isolate PCI devices from the Linux host, and the SWIOTLB is
used to map all DMA buffers from a static memory carve-out.
You can find more details and background information here:
https://lore.kernel.org/all/20250422061406.112539-1-huaqian.li@xxxxxxxxxxx/
By making IO_TLB_SEGSIZE configurable via a kernel configuration option
(CONFIG_SWIOTLB_SEGSIZE), users can adjust the segment size to
accommodate such use cases. This change improves flexibility and
ensures that systems can be tailored to meet the requirements of
specific hardware, such as the Hailo 8 PCIe AI card, without requiring
kernel source modifications.
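For illustration only, the header side of such a change might look roughly like the sketch below. This is an assumption, not the posted patch: the fallback #ifndef exists only so the snippet compiles standalone (in a kernel build the value would come from Kconfig), and the power-of-two check reflects the requirement noted in the comment above IO_TLB_SEGSIZE in include/linux/swiotlb.h.

```c
/*
 * Hypothetical sketch, not the actual patch. In a kernel build
 * CONFIG_SWIOTLB_SEGSIZE would be generated from kernel/dma/Kconfig;
 * the fallback below is only here so this compiles on its own.
 */
#ifndef CONFIG_SWIOTLB_SEGSIZE
#define CONFIG_SWIOTLB_SEGSIZE 128
#endif

/* Maximum number of contiguous slots, now taken from the config option. */
#define IO_TLB_SEGSIZE CONFIG_SWIOTLB_SEGSIZE

/* The swiotlb code masks with (IO_TLB_SEGSIZE - 1), so reject other values. */
_Static_assert(IO_TLB_SEGSIZE > 0 &&
	       (IO_TLB_SEGSIZE & (IO_TLB_SEGSIZE - 1)) == 0,
	       "IO_TLB_SEGSIZE must be a power of 2");
```

Under that constraint, covering a 160-slot request would mean selecting 256 rather than 160.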
I hope this example clarifies the need for this change. Please let me
know if further details or additional examples are required.
Best Regards,
Li Hua Qian
Li Hua Qian (1):
  swiotlb: Make IO_TLB_SEGSIZE configurable
 include/linux/swiotlb.h | 2 +-
 kernel/dma/Kconfig | 7 +++++++
 2 files changed, 8 insertions(+), 1 deletion(-)
Best regards,
--
Hua Qian Li
Siemens AG
http://www.siemens.com/