Re: [PATCH v2] sched/cache: Reduce the overhead of task_cache_work by only scan the visisted cpus.
From: Chen, Yu C
Date: Mon Apr 20 2026 - 03:54:19 EST
On 4/18/2026 5:01 PM, Luo Gengkun wrote:
On 2026/4/15 11:10, Chen, Yu C wrote:
Hi Gengkun,
[ ... ]
Do we really need to access rq->cpu_epoch under the lock for read scenarios?
@@ -1736,8 +1746,17 @@ static void task_cache_work(struct callback_head *work)
continue;
for_each_cpu(i, sched_domain_span(sd)) {
- occ = fraction_mm_sched(cpu_rq(i),
- per_cpu_ptr(mm->sc_stat.pcpu_sched, i));
+ struct rq *rq = cpu_rq(i);
+ struct sched_cache_time *pcpu_sched = per_cpu_ptr(mm->sc_stat.pcpu_sched, i);
+ /* Skip the rq that has not been hit for a long time */
+ if (sched_cache_timeout_enabled() &&
+ cpumask_test_cpu(cpu_of(rq), &mm->sc_stat.visited_cpus) &&
cpumask_test_cpu(i) should be fine. The rq access above doesn't hold cpu_epoch_lock.
I wonder if we can safely calculate rq->cpu_epoch - pcpu_sched->epoch
inside fraction_mm_sched while holding the lock?
I noticed task_tick_cache accesses it directly. Plus, moving this access outside
the lock would help reduce lock contention.
Good question. task_tick_cache() accesses the local rq->cpu_epoch with rq->lock held
and irqs disabled, while task_cache_work() runs with irqs enabled and without
any rq->lock held, and might not run on the local rq: see __exit_to_user_mode_loop(),
which checks _TIF_NEED_RESCHED before _TIF_NOTIFY_RESUME, so p could be switched out,
woken up, and then run task_cache_work() on a different CPU.
That is to say, I just wonder if there could be a race window
that brings inconsistency between the two reads of rq->cpu_epoch - pcpu_sched->epoch
- not necessarily a critical issue though.
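
To illustrate the concern, here is a minimal userspace sketch, not kernel code.
The demo_* names and the timeout comparison are hypothetical. I am only assuming
that two places each derive something from rq->cpu_epoch - pcpu_sched->epoch.
The sketch reads each epoch once and reuses the resulting delta, so the two
consumers cannot disagree even if the epoch advances in between:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the kernel's struct rq / sched_cache_time. */
struct demo_rq {
	_Atomic uint64_t cpu_epoch;	/* advanced concurrently by the owning CPU */
};

struct demo_pcpu_sched {
	_Atomic uint64_t epoch;		/* epoch at which this mm last ran on the CPU */
};

/*
 * Take one snapshot of each epoch and return the difference.
 * Callers keep the returned value instead of re-reading the epochs.
 */
static uint64_t demo_epoch_delta(struct demo_rq *rq, struct demo_pcpu_sched *ps)
{
	assert(rq && ps);

	uint64_t now  = atomic_load_explicit(&rq->cpu_epoch, memory_order_relaxed);
	uint64_t last = atomic_load_explicit(&ps->epoch, memory_order_relaxed);

	return now - last;
}

/* Hypothetical skip check: consumes the snapshot, never the live epoch. */
static int demo_timed_out(uint64_t delta, uint64_t timeout)
{
	return delta > timeout;
}
```

In the kernel the equivalent of the relaxed loads would presumably be READ_ONCE(),
with the delta passed down to whatever needs it. Whether that is worth doing
depends on how much a stale or mismatched delta actually matters here.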
thanks,
Chenyu