[PATCH 1/4] smp: Improve locality in smp_call_function_any()
From: Yury Norov
Date: Fri Jun 06 2025 - 16:27:54 EST
From: "Yury Norov [NVIDIA]" <yury.norov@xxxxxxxxx>
smp_call_function_any() tries to make a local call as it's the cheapest
option, or switches to a CPU in the same node. If it's not possible, the
algorithm gives up and searches for any CPU, in a numerical order.
Instead, we can search for the best CPU based on NUMA locality, including
2nd nearest hop (a set of equidistant nodes), and higher.
sched_numa_find_nth_cpu() does exactly that, and also helps to drop most
of housekeeping code.
Signed-off-by: Yury Norov [NVIDIA] <yury.norov@xxxxxxxxx>
---
kernel/smp.c | 19 +++----------------
1 file changed, 3 insertions(+), 16 deletions(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index 974f3a3962e8..7c8cfab0ce55 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -741,32 +741,19 @@ EXPORT_SYMBOL_GPL(smp_call_function_single_async);
*
* Selection preference:
* 1) current cpu if in @mask
- * 2) any cpu of current node if in @mask
- * 3) any other online cpu in @mask
+ * 2) nearest cpu in @mask, based on NUMA topology
*/
int smp_call_function_any(const struct cpumask *mask,
smp_call_func_t func, void *info, int wait)
{
unsigned int cpu;
- const struct cpumask *nodemask;
int ret;
/* Try for same CPU (cheapest) */
cpu = get_cpu();
- if (cpumask_test_cpu(cpu, mask))
- goto call;
-
- /* Try for same node. */
- nodemask = cpumask_of_node(cpu_to_node(cpu));
- for (cpu = cpumask_first_and(nodemask, mask); cpu < nr_cpu_ids;
- cpu = cpumask_next_and(cpu, nodemask, mask)) {
- if (cpu_online(cpu))
- goto call;
- }
+ if (!cpumask_test_cpu(cpu, mask))
+ cpu = sched_numa_find_nth_cpu(mask, 0, cpu_to_node(cpu));
- /* Any online will do: smp_call_function_single handles nr_cpu_ids. */
- cpu = cpumask_any_and(mask, cpu_online_mask);
-call:
ret = smp_call_function_single(cpu, func, info, wait);
put_cpu();
return ret;
--
2.43.0