Re: [Patch v4 1/3] lib: Restrict cpumask_local_spread to houskeeping CPUs

From: Robin Murphy
Date: Wed Jan 27 2021 - 08:51:46 EST


On 2021-01-27 13:09, Marcelo Tosatti wrote:
On Wed, Jan 27, 2021 at 12:36:30PM +0000, Robin Murphy wrote:
On 2021-01-27 12:19, Marcelo Tosatti wrote:
On Wed, Jan 27, 2021 at 11:57:16AM +0000, Robin Murphy wrote:
Hi,

On 2020-06-25 23:34, Nitesh Narayan Lal wrote:
From: Alex Belits <abelits@xxxxxxxxxxx>

The current implementation of cpumask_local_spread() does not respect
isolated CPUs, i.e., even if a CPU has been isolated for a real-time
task, it can still be returned to the caller for pinning of its IRQ
threads. Having these unwanted IRQ threads on an isolated CPU adds to
the latency overhead.

Restrict the CPUs that are returned for spreading IRQs only to the
available housekeeping CPUs.

Signed-off-by: Alex Belits <abelits@xxxxxxxxxxx>
Signed-off-by: Nitesh Narayan Lal <nitesh@xxxxxxxxxx>
---
lib/cpumask.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/lib/cpumask.c b/lib/cpumask.c
index fb22fb266f93..85da6ab4fbb5 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -6,6 +6,7 @@
#include <linux/export.h>
#include <linux/memblock.h>
#include <linux/numa.h>
+#include <linux/sched/isolation.h>
/**
* cpumask_next - get the next cpu in a cpumask
@@ -205,22 +206,27 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
*/
unsigned int cpumask_local_spread(unsigned int i, int node)
{
-	int cpu;
+	int cpu, hk_flags;
+	const struct cpumask *mask;
+	hk_flags = HK_FLAG_DOMAIN | HK_FLAG_MANAGED_IRQ;
+	mask = housekeeping_cpumask(hk_flags);

AFAICS, this generally resolves to something based on cpu_possible_mask
rather than cpu_online_mask as before, so could now potentially return an
offline CPU. Was that an intentional change?
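
For reference, with CONFIG_CPU_ISOLATION disabled the lookup degenerates to
the stub in include/linux/sched/isolation.h, which is essentially:

static inline const struct cpumask *housekeeping_cpumask(enum hk_flags flags)
{
	return cpu_possible_mask;
}

and even with isolation enabled, the housekeeping mask is built from
cpu_possible_mask at boot rather than tracking cpu_online_mask.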

Robin,

AFAICS online CPUs should be filtered.

Apologies if I'm being thick, but can you explain how? In the case of
isolation being disabled or compiled out, housekeeping_cpumask() is
literally just "return cpu_possible_mask;". If we then iterate over that
with for_each_cpu() and just return the i'th possible CPU (e.g. in the
NUMA_NO_NODE case), what guarantees that CPU is actually online?
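
To make that concrete, a hypothetical example (CPU counts invented purely
for illustration): a machine with 8 possible CPUs, CPUs 4-7 currently
offline, and isolation compiled out, so mask == cpu_possible_mask:

/*
 * Before the patch:
 *	i %= num_online_cpus();             -> 5 % 4 == 1
 *	for_each_cpu(cpu, cpu_online_mask)  -> returns CPU 1, which is online
 *
 * After the patch:
 *	i %= cpumask_weight(mask);          -> 5 % 8 == 5
 *	for_each_cpu(cpu, mask)             -> returns CPU 5, which is offline
 */
unsigned int cpu = cpumask_local_spread(5, NUMA_NO_NODE);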

Robin.

Nothing, but that was the situation before 1abdfe706a579a702799fce465bceb9fb01d407c
as well.

True, if someone calls from a racy context then there's not much we can do to ensure that any CPU *remains* online after we initially observed it to be, but when it's called from somewhere safe like a cpuhp offline handler, then picking from cpu_online_mask *did* always do the right thing (by my interpretation), whereas picking from housekeeping_cpumask() might not.
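
For illustration, the kind of "safe" caller in question could be a cpuhp
teardown callback along the lines of the sketch below - example_ctx and
example_cpu_offline are invented names, not existing kernel code. The
callback runs with CPU hotplug serialized, so no other CPU can come or go
while it picks a new target:

#include <linux/cpu.h>
#include <linux/cpuhotplug.h>
#include <linux/cpumask.h>
#include <linux/interrupt.h>

static struct {
	int irq;
	int node;
} example_ctx;

/* Teardown callback: runs under the hotplug lock while a CPU goes down. */
static int example_cpu_offline(unsigned int cpu)
{
	unsigned int target = cpumask_local_spread(0, example_ctx.node);

	/*
	 * The old cpu_online_mask-based lookup could only hand back a CPU
	 * that was in cpu_online_mask at this point.
	 */
	irq_set_affinity_hint(example_ctx.irq, cpumask_of(target));
	return 0;
}

/* Registered with something like:
 *	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "example:online",
 *			  NULL, example_cpu_offline);
 */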

This is why I decided to ask rather than just send a patch to fix what I think might be a bug - I have no objection if this *is* intended behaviour, other than suggesting we amend the "...selects an online CPU..." comment if that aspect was never meant to be relied upon.

Thanks,
Robin.


cpumask_local_spread() should probably be disabling CPU hotplug.

Thomas?
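
A rough sketch of what that could look like - not a tested patch, and the
NUMA-preference loops of the real function are omitted for brevity. The
hotplug read lock is held across the walk and the result is filtered
against cpu_online_mask, so the selection itself cannot race with a CPU
going away (the caller can of course still see it go offline afterwards):

#include <linux/cpu.h>
#include <linux/cpumask.h>
#include <linux/sched/isolation.h>

unsigned int cpumask_local_spread(unsigned int i, int node)
{
	const struct cpumask *mask;
	int cpu;

	cpus_read_lock();
	mask = housekeeping_cpumask(HK_FLAG_DOMAIN | HK_FLAG_MANAGED_IRQ);

	/* Wrap: we always want a cpu (note: still counts offline CPUs). */
	i %= cpumask_weight(mask);

	for_each_cpu_and(cpu, mask, cpu_online_mask) {
		if (i-- == 0)
			goto out;
	}

	/* Walked past the online subset of the housekeeping mask. */
	cpu = cpumask_any(cpu_online_mask);
out:
	cpus_read_unlock();
	return cpu;
}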


I was just looking at the current code since I had the rare presence of mind
to check if something suitable already existed before I start open-coding
"any online CPU, but local node preferred" logic for handling IRQ affinity
in a driver - cpumask_local_spread() appears to be almost what I want (if a
bit more heavyweight), if only it would actually guarantee an online CPU as
the kerneldoc claims :(
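
For illustration, such an open-coded "local node preferred, else any online
CPU" helper might look roughly like this - pick_irq_target_cpu is a made-up
name, and without cpus_read_lock() it has the same "the CPU can go offline
right after we pick it" caveat as above:

#include <linux/cpumask.h>
#include <linux/numa.h>
#include <linux/topology.h>

static unsigned int pick_irq_target_cpu(int node)
{
	unsigned int cpu;

	if (node != NUMA_NO_NODE) {
		/* Prefer an online CPU on the local node... */
		cpu = cpumask_first_and(cpumask_of_node(node), cpu_online_mask);
		if (cpu < nr_cpu_ids)
			return cpu;
	}

	/* ...otherwise settle for any online CPU. */
	return cpumask_first(cpu_online_mask);
}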

Robin.

	/* Wrap: we always want a cpu. */
-	i %= num_online_cpus();
+	i %= cpumask_weight(mask);
	if (node == NUMA_NO_NODE) {
-		for_each_cpu(cpu, cpu_online_mask)
+		for_each_cpu(cpu, mask) {
			if (i-- == 0)
				return cpu;
+		}
	} else {
		/* NUMA first. */
-		for_each_cpu_and(cpu, cpumask_of_node(node), cpu_online_mask)
+		for_each_cpu_and(cpu, cpumask_of_node(node), mask) {
			if (i-- == 0)
				return cpu;
+		}
-		for_each_cpu(cpu, cpu_online_mask) {
+		for_each_cpu(cpu, mask) {
			/* Skip NUMA nodes, done above. */
			if (cpumask_test_cpu(cpu, cpumask_of_node(node)))
				continue;