Re: [PATCH] RDMA/siw: work around clang stack size warning

From: Arnd Bergmann
Date: Sat Jun 21 2025 - 04:43:49 EST


On Sat, Jun 21, 2025, at 06:12, Zhu Yanjun wrote:
> On 2025/6/20 4:43, Arnd Bergmann wrote:
>
> Because the array of kvec structures in siw_tx_hdt consumes the majority
> of the stack space, would it be possible to use kmalloc or a similar
> dynamic memory allocation function instead of allocating this memory on
> the stack?
>
> Would using kmalloc (or an equivalent) also effectively resolve the
> stack usage issue?

Yes, moving the allocation somewhere else (kmalloc, static variable,
per siw_sge, per siw_wqe) would effectively avoid the high stack usage.
It's a tradeoff: I picked the solution that made the most sense
to me, but there is a good chance another alternative is better here.

The main differences are:

- kmalloc() adds runtime overhead that may be expensive in a
fast path

- kmalloc() can fail, which adds complexity from error handling.
Note that small allocations with GFP_KERNEL do not fail in practice
but instead wait for memory to become available

- If kmalloc() runs into a low-memory situation, it can go through
writeback, which in turn can use more stack space than the
on-stack allocation it was replacing

- static allocations bloat the kernel image and require locking that
may be expensive

- per-object preallocations can be wasteful if a lot of objects
are created, and can still require locking if the object is used
from multiple threads
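
To illustrate the first two points, here is an untested userspace
sketch, not siw code. struct iovec stands in for struct kvec and
calloc()/free() stand in for kcalloc()/kfree(); MAX_ARRAY and the
function names are made up for the example.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

#define MAX_ARRAY 64	/* hypothetical array size, not the driver's */

/* Variant 1: the array lives in this function's stack frame. */
static int tx_on_stack(size_t nsegs)
{
	struct iovec iov[MAX_ARRAY];

	if (nsegs > MAX_ARRAY)
		return -EINVAL;
	memset(iov, 0, sizeof(iov));
	return (int)nsegs;
}

/* Variant 2: heap allocation, which needs an error path and a free. */
static int tx_on_heap(size_t nsegs)
{
	struct iovec *iov;

	if (nsegs > MAX_ARRAY)
		return -EINVAL;
	iov = calloc(MAX_ARRAY, sizeof(*iov));	/* kcalloc() in the kernel */
	if (!iov)
		return -ENOMEM;			/* extra failure mode */
	free(iov);				/* kfree() in the kernel */
	return (int)nsegs;
}
```

The heap variant keeps the frame small but pays for an allocation and
a free on every call, plus the -ENOMEM path that callers must handle.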

As I wrote, I mainly picked the 'noinline_for_stack' approach
here since that is how the code is known to work with gcc, so
there is little risk of my patch causing problems.
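
For reference, the general shape of that approach looks like the
untested userspace sketch below. In the kernel, noinline_for_stack is
simply an alias for noinline; the macro here mimics that for gcc and
clang. BUF_WORDS and the function names are invented for the example
and have nothing to do with the actual siw functions.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's noinline_for_stack annotation. */
#define noinline_for_stack __attribute__((noinline))

#define BUF_WORDS 128	/* hypothetical buffer size */

/* The large buffer only occupies stack while this helper is running. */
static noinline_for_stack size_t fill_and_sum(size_t n)
{
	size_t buf[BUF_WORDS];
	size_t i, sum = 0;

	if (n > BUF_WORDS)
		n = BUF_WORDS;
	for (i = 0; i < n; i++)
		buf[i] = i;
	for (i = 0; i < n; i++)
		sum += buf[i];
	return sum;
}

/* Because the helper is not inlined, this frame does not contain buf. */
static size_t caller(size_t n)
{
	return fill_and_sum(n) + 1;
}
```

The total stack depth while the helper runs is unchanged; what the
annotation prevents is the compiler merging the large frame into the
caller, where it would stay allocated across everything else the
caller does.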

Moving both the kvec array and the page array into
the siw_wqe is likely better here, but I'm not familiar enough
with the driver to tell whether that is an overall improvement.
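
Roughly, a per-object preallocation would look like the untested
userspace sketch below. The struct layout, MAX_ARRAY and the helper
are all hypothetical and do not reflect the real struct siw_wqe;
struct iovec again stands in for struct kvec.

```c
#include <stddef.h>
#include <sys/uio.h>

#define MAX_ARRAY 64	/* hypothetical array size, not the driver's */

/* Hypothetical work-queue element carrying its own scratch arrays. */
struct fake_wqe {
	struct iovec iov[MAX_ARRAY];	/* was on the stack */
	void *pages[MAX_ARRAY];		/* was on the stack */
	size_t nsegs;
};

/* Append one segment; returns 0 on success, -1 if the arrays are full. */
static int fake_wqe_add_seg(struct fake_wqe *wqe, void *base, size_t len)
{
	if (wqe->nsegs >= MAX_ARRAY)
		return -1;
	wqe->iov[wqe->nsegs].iov_base = base;
	wqe->iov[wqe->nsegs].iov_len = len;
	wqe->pages[wqe->nsegs] = NULL;	/* no page mapped in this sketch */
	wqe->nsegs++;
	return 0;
}
```

The cost is that every object grows by both arrays whether or not it
ever transmits, and concurrent users of one object would need to
serialize access to them.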

A related change I would like to see is to remove the
kmap_local_page() in this driver and instead make it
depend on 64BIT or !CONFIG_HIGHMEM, to slowly chip away
at the code that is highmem aware throughout the kernel.
I'm not sure if that would also help drop the array
here.

Arnd