Re: [RFC PATCH] sched/fair: Interleave cfs bandwidth timers for improved single thread performance at low utilization

From: Peter Zijlstra
Date: Mon Feb 20 2023 - 12:38:22 EST


On Tue, Feb 14, 2023 at 08:54:09PM +0530, shrikanth hegde wrote:

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index ff4dbbae3b10..7b69c329e05d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5939,14 +5939,25 @@ static void init_cfs_rq_runtime(struct cfs_rq *cfs_rq)
>
>  void start_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
>  {
> -	lockdep_assert_held(&cfs_b->lock);
> +	struct hrtimer *period_timer = &cfs_b->period_timer;
> +	s64 incr = ktime_to_ns(cfs_b->period) / 10;
> +	ktime_t delta;
> +	u64 orun = 1;
>
> +	lockdep_assert_held(&cfs_b->lock);
>  	if (cfs_b->period_active)
>  		return;
>
>  	cfs_b->period_active = 1;
> -	hrtimer_forward_now(&cfs_b->period_timer, cfs_b->period);
> -	hrtimer_start_expires(&cfs_b->period_timer, HRTIMER_MODE_ABS_PINNED);
> +	delta = ktime_sub(period_timer->base->get_time(),
> +			hrtimer_get_expires(period_timer));
> +	if (unlikely(delta >= cfs_b->period)) {
> +		orun = ktime_divns(delta, incr);
> +		hrtimer_add_expires_ns(period_timer, incr * orun);
> +	}
> +
> +	hrtimer_forward_now(period_timer, cfs_b->period);
> +	hrtimer_start_expires(period_timer, HRTIMER_MODE_ABS_PINNED);
>  }

What kind of mad hackery is this? Why can't you do the sane thing and
initialize the timer at !0 in init_cfs_bandwidth()? Then any of the
forwards will stay in period -- as they should.

Please, go re-read Thomas's email.
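
To illustrate the point about forwards staying in period, here is a small
userspace sketch. It is not kernel code: struct model_timer,
model_forward() and model_phase() are hypothetical names made up for this
example, and model_forward() only approximates what hrtimer_forward() does
(advance an expired timer by a whole number of intervals). Under that
assumption, whatever offset the expiry starts with is preserved modulo the
period.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model: a timer reduced to its absolute expiry (ns). */
struct model_timer {
	int64_t expires;
};

/*
 * Simplified model of forwarding an expired timer: move the expiry past
 * `now` by adding a whole number of `interval`s. Returns the number of
 * intervals added, or 0 if the timer has not expired yet.
 */
static uint64_t model_forward(struct model_timer *t, int64_t now,
			      int64_t interval)
{
	int64_t delta = now - t->expires;
	int64_t orun;

	if (delta < 0)
		return 0;	/* not expired yet: leave the expiry alone */

	/* Smallest number of whole intervals that lands strictly after now. */
	orun = delta / interval + 1;
	t->expires += orun * interval;

	return (uint64_t)orun;
}

/* Offset of the expiry within its period. */
static int64_t model_phase(const struct model_timer *t, int64_t interval)
{
	return t->expires % interval;
}
```

With a 100 ms period and now = 250 ms, a timer whose expiry starts at 0
lands on 300 ms (phase 0), while one that starts at 37 ms lands on 337 ms
(phase 37 ms). Timers that all start at 0 therefore stay aligned with each
other; timers that start at different offsets stay interleaved.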

*completely* untested...

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7c46485d65d7..4d6ea76096dc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5915,6 +5915,7 @@ void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b)

 	INIT_LIST_HEAD(&cfs_b->throttled_cfs_rq);
 	hrtimer_init(&cfs_b->period_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED);
+	cfs_b->period_timer.node.expires = get_random_u32_below(cfs_b->period);
 	cfs_b->period_timer.function = sched_cfs_period_timer;
 	hrtimer_init(&cfs_b->slack_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	cfs_b->slack_timer.function = sched_cfs_slack_timer;