[PATCH v9 2/2] sched: Annotate sched_clock_irqtime with __read_mostly

From: Yafang Shao
Date: Sat May 10 2025 - 23:09:04 EST


Eric reported an issue [0] as follows:
: rebalance_domains() can attempt to change sched_balance_running
: more than 350,000 times per second on our servers.
:
: If sched_clock_irqtime and sched_balance_running share the
: same cache line, we see a very high cost on hosts with 480 threads
: dealing with many interrupts.

While the rebalance_domains() issue has been resolved [1], we should
still proactively annotate sched_clock_irqtime with __read_mostly to
prevent potential cache line false sharing with frequently written
variables. The annotation is cheap and well justified, since
sched_clock_irqtime is only written when the TSC is declared unstable.

Link: https://lore.kernel.org/all/20250423174634.3009657-1-edumazet@xxxxxxxxxx/ [0]
Link: https://lore.kernel.org/all/20250416035823.1846307-1-tim.c.chen@xxxxxxxxxxxxxxx/ [1]

Reported-by: Eric Dumazet <edumazet@xxxxxxxxxx>
Debugged-by: Eric Dumazet <edumazet@xxxxxxxxxx>
Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
Cc: Eric Dumazet <edumazet@xxxxxxxxxx>
---
kernel/sched/cputime.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 6dab4854c6c0..c499a42ceda4 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -22,7 +22,7 @@
*/
DEFINE_PER_CPU(struct irqtime, cpu_irqtime);

-int sched_clock_irqtime;
+int __read_mostly sched_clock_irqtime;

void enable_sched_clock_irqtime(void)
{
--
2.43.5