Re: [PATCH] sched/stats: TASK_IDLE task bypass the block_starts time

From: Peter Zijlstra
Date: Fri Jun 20 2025 - 04:55:36 EST


On Fri, Jun 20, 2025 at 11:14:50AM +0800, Olice Zou wrote:
> For a TASK_IDLE task, we should not record block_start; it is
> not a real TASK_UNINTERRUPTIBLE task.

Why, I mean it is still blocked, right?

> This problem is easy to find on an idle machine, as follows:
>
> bpftrace -e 'tracepoint:sched:sched_stat_blocked { \
>     if (args->delay > 1000000) \
>     { \
>         printf("%s %d\n", args->comm, args->delay); \
>         print(kstack()); \
>     } \
> }'
>
> rcu_preempt 3881764
> __update_stats_enqueue_sleeper+604
> __update_stats_enqueue_sleeper+604
> enqueue_entity+1014
> enqueue_task_fair+156
> activate_task+109
> ttwu_do_activate+111
> try_to_wake_up+615
> wake_up_process+25
> process_timeout+22
> call_timer_fn+44
> run_timer_softirq+1100
> handle_softirqs+178
> irq_exit_rcu+113
> sysvec_apic_timer_interrupt+132
> asm_sysvec_apic_timer_interrupt+31
> pv_native_safe_halt+15
> arch_cpu_idle+13
> default_idle_call+48
> do_idle+516
> cpu_startup_entry+49
> start_secondary+280
> secondary_startup_64_no_verify+404

Not sure what I'm looking at there. What is the problem?

> Signed-off-by: Olice Zou <olicezou@xxxxxxxxxxx>
> ---
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index a85539df75a5..e473e3244dda 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1285,7 +1285,7 @@ update_stats_dequeue_fair(struct cfs_rq *cfs_rq, struct sched_entity *se, int fl
>  		if (state & TASK_INTERRUPTIBLE)
>  			__schedstat_set(tsk->stats.sleep_start,
>  				      rq_clock(rq_of(cfs_rq)));
> -		if (state & TASK_UNINTERRUPTIBLE)
> +		if (state != TASK_IDLE && (state & TASK_UNINTERRUPTIBLE))
>  			__schedstat_set(tsk->stats.block_start,
>  				      rq_clock(rq_of(cfs_rq)));
>  	}
> --
> 2.25.1
>
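
For reference, a minimal userspace sketch of the condition the hunk above
changes. It is not kernel code: the two helper functions are hypothetical
names made up for illustration, and the state values are assumed to match
include/linux/sched.h in recent kernels (check your own tree). It shows why
the unpatched test also fires for TASK_IDLE, which is defined as
TASK_UNINTERRUPTIBLE | TASK_NOLOAD.

```c
#include <stdbool.h>

/*
 * Assumed to mirror include/linux/sched.h in recent kernels;
 * verify against the tree you are working on.
 */
#define TASK_INTERRUPTIBLE	0x00000001u
#define TASK_UNINTERRUPTIBLE	0x00000002u
#define TASK_NOLOAD		0x00000400u
#define TASK_IDLE		(TASK_UNINTERRUPTIBLE | TASK_NOLOAD)

/* Hypothetical helper: the block_start condition before the patch. */
static bool records_block_start_old(unsigned int state)
{
	/* TASK_IDLE carries the TASK_UNINTERRUPTIBLE bit, so it matches. */
	return (state & TASK_UNINTERRUPTIBLE) != 0;
}

/* Hypothetical helper: the block_start condition after the patch. */
static bool records_block_start_new(unsigned int state)
{
	/* Exact comparison: only a state equal to TASK_IDLE is skipped. */
	return state != TASK_IDLE && (state & TASK_UNINTERRUPTIBLE) != 0;
}
```

With these definitions, records_block_start_old(TASK_IDLE) is true and
records_block_start_new(TASK_IDLE) is false, while a plain
TASK_UNINTERRUPTIBLE state is recorded by both. Because the patched test is
an exact comparison, a state that combines TASK_IDLE with further flag bits
would still be recorded under this sketch.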