Re: [PATCH 3/4] nohz: add tick_nohz_full_clear_cpus() API

From: Chris Metcalf
Date: Mon Mar 30 2015 - 12:46:04 EST


On 03/30/2015 12:41 PM, Rik van Riel wrote:
On 03/30/2015 12:20 PM, Chris Metcalf wrote:
I wanted to ping the patch below again, since I haven't heard any
feedback.

I note that Rik van Riel's change posted this weekend offers similar
functionality for userspace. My change offers a convenient API
for, e.g., kernel drivers setting up default irq balancing.

https://lkml.org/lkml/2015/3/28/94
I submitted a patch to irqbalance to exclude nohz_full
cpus from having irqs assigned to them. I could see
the same thing being useful for in-kernel irq assignment,
especially for multi-queue devices that set up irqs on
multiple CPUs.

An alternate API would be one that just returned the full nohz_full
cpumask to kernel callers; I'd be happy with that as well, but my
instinct was to make the API as narrow as possible to start with.

Comments?
What drivers and subsystems are you targeting?

At the moment, just the on-chip tilegx ethernet controller
(drivers/net/ethernet/tile/tilegx.c) but I'm pushing to upstream
a number of other commits from the Tilera "dataplane" mode,
some of which include code that acts on cpumasks as well.

I am just looking at blk-mq now, and it seems like the
API most appropriate for that would be an inline function
that tests whether or not a CPU is nohz_full.

That API already exists - tick_nohz_full_cpu().

	for_each_possible_cpu(i) {
		...
		if (cpu_nohz_full(i))
			continue;
	}

A lot of the other code in drivers and subsystems that
sets up per-cpu queues and irqs seems to iterate over all
CPUs at init time, and could benefit from a function
allowing it to skip nohz_full CPUs.

Your tick_nohz_full_clear_cpus() function seems reasonable
too, for code that uses a cpumask to set up per cpu stuff.
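
The two idioms discussed here, skipping nohz_full CPUs inside a per-CPU
loop and stripping them from a cpumask in one step, should select the same
CPUs. A minimal userspace sketch of that equivalence, assuming a cpumask can
be modeled as a 64-bit word; every model_* and collect_* name is a
hypothetical stand-in and not a kernel symbol:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for a kernel cpumask: bit n set means CPU n is present. */
typedef uint64_t model_cpumask_t;

#define MODEL_NR_CPUS 64

/* Hypothetical model of the nohz_full set; not the kernel's variable. */
static model_cpumask_t model_nohz_full_mask;

/* Models a per-CPU test such as tick_nohz_full_cpu(). */
static bool model_nohz_full_cpu(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	return (model_nohz_full_mask >> cpu) & 1u;
}

/* Models tick_nohz_full_clear_cpus(): drop nohz_full CPUs from *mask. */
static void model_nohz_full_clear_cpus(model_cpumask_t *mask)
{
	*mask &= ~model_nohz_full_mask;
}

/* Idiom 1: walk the candidate CPUs and skip the nohz_full ones. */
static model_cpumask_t collect_by_loop(model_cpumask_t candidates)
{
	model_cpumask_t out = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!((candidates >> cpu) & 1u))
			continue;	/* not a candidate at all */
		if (model_nohz_full_cpu(cpu))
			continue;	/* leave nohz_full CPUs alone */
		out |= (model_cpumask_t)1 << cpu;
	}
	return out;
}

/* Idiom 2: filter the whole candidate mask in a single step. */
static model_cpumask_t collect_by_mask(model_cpumask_t candidates)
{
	model_nohz_full_clear_cpus(&candidates);
	return candidates;
}
```

With model_nohz_full_mask set to 0xF0 (CPUs 4-7), both collect_by_loop(0xFF)
and collect_by_mask(0xFF) yield 0x0F. The loop form suits code that already
iterates per CPU; the mask form suits code that hands a cpumask to another
interface.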

I'm happy to ask for a pull request for the tile architecture
that includes that commit, if no one objects. I'd be happier
if someone acked the patch more explicitly, though.

Thanks!

From: Chris Metcalf <cmetcalf@xxxxxxxxxx>

This is useful, for example, to modify a cpumask to avoid the
nohz cores so that interrupts aren't sent to them.

Signed-off-by: Chris Metcalf <cmetcalf@xxxxxxxxxx>
---
Motivated by patch 4/4 in this series.

include/linux/tick.h | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9c085dc12ae9..d53ad4892a39 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -186,6 +186,12 @@ static inline bool tick_nohz_full_cpu(int cpu)
 	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
 }
 
+static inline void tick_nohz_full_clear_cpus(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_andnot(mask, mask, tick_nohz_full_mask);
+}
+
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
@@ -194,6 +200,7 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
+static inline void tick_nohz_full_clear_cpus(struct cpumask *mask) { }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick_cpu(int cpu) { }
 static inline void tick_nohz_full_kick(void) { }
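
One thing a caller of a helper like this may want to consider is that the
filtered mask can end up empty when every CPU it started with is nohz_full.
The helper above returns void, so the caller would have to check the mask
itself (for example with cpumask_empty()). A rough userspace sketch of one
possible policy, assuming a 64-bit word as the mask model; pick_irq_cpus()
and queue_to_cpu() are hypothetical names, and the fall-back-to-original
behavior is an assumption for illustration, not something this patch does:

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for a kernel cpumask: bit n set means CPU n is present. */
typedef uint64_t model_cpumask_t;

/*
 * Hypothetical policy: remove the isolated (nohz_full) CPUs from the wanted
 * set, but keep the original set if nothing would be left.
 */
static model_cpumask_t pick_irq_cpus(model_cpumask_t wanted,
				     model_cpumask_t isolated)
{
	model_cpumask_t filtered = wanted & ~isolated;

	assert(wanted != 0);	/* caller must offer at least one CPU */
	return filtered ? filtered : wanted;
}

/*
 * Hypothetical helper: spread queues round-robin over the set bits of cpus.
 * Returns the chosen CPU number, or -1 if cpus is empty.
 */
static int queue_to_cpu(model_cpumask_t cpus, unsigned int queue)
{
	unsigned int nset = 0, cpu, target;

	/* Count the CPUs available for interrupt delivery. */
	for (cpu = 0; cpu < 64; cpu++)
		nset += (unsigned int)((cpus >> cpu) & 1u);
	if (nset == 0)
		return -1;

	/* Walk to the (queue % nset)-th set bit. */
	target = queue % nset;
	for (cpu = 0; cpu < 64; cpu++) {
		if (!((cpus >> cpu) & 1u))
			continue;
		if (target-- == 0)
			return (int)cpu;
	}
	return -1;
}
```

For a multi-queue device this would amount to calling
queue_to_cpu(pick_irq_cpus(wanted, isolated), q) for each queue q. Whether
falling back to the unfiltered mask is preferable to failing outright is a
policy choice for the driver.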

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
