Re: [RFC PATCH] sched/fair: Interleave cfs bandwidth timers for improved single thread performance at low utilization

From: Benjamin Segall
Date: Wed Feb 15 2023 - 16:32:22 EST


shrikanth hegde <sshegde@xxxxxxxxxxxxxxxxxx> writes:

>>>
>>>                6.2.rc5            |            with patch
>>>         1CG  power  2CG  power    |  1CG  power  2CG        power
>>> 1Core   218  44     315  46       |  219  45     277(+12%)  47(-2%)
>>>         219  43     315  45       |  219  44     244(+22%)  48(-6%)
>>>                                   |
>>> 2Core   108  48     158  52       |  109  50     114(+26%)  59(-13%)
>>>         109  49     157  52       |  109  49     136(+13%)  56(-7%)
>>>                                   |
>>> 4Core   60   59     89   65       |  62   58     72(+19%)   68(-5%)
>>>         61   61     90   65       |  62   60     68(+24%)   73(-12%)
>>>                                   |
>>> 8Core   33   77     48   83       |  33   77     37(+23%)   91(-10%)
>>>         33   77     48   84       |  33   77     38(+21%)   90(-7%)
>>>
>>> There is no benefit at higher utilization of 50% or more. There is no
>>> degradation either.
>>>
>>> This is RFC PATCH V2, where the code has been shifted from hrtimer to
>>> sched. This patch sets an initial value as multiple of period/10.
>>> Here timers can still align if the cgroups are started within the same
>>> period/10 interval. On a real life workload, time gives sufficient
>>> randomness. There can be a better interleaving by being more
>>> deterministic. For example, when there are 2 cgroups, they should
>>> have initial value of 0/50ms or 10/60ms so on. When there are 3 cgroups,
>>> 0/3/6ms or 1/4/7ms etc. That is more complicated as it has to account
>>> for cgroup addition/deletion and accuracy w.r.t. period/quota.
>>> If that approach is better here, then will come up with that patch.
>>
>> This does seem vaguely reasonable, though the power argument of
>> consolidating wakeups and such is something that we intentionally do in
>> other situations.
>>
> Thank you Benjamin for taking a look and spending time in reviewing this.
>> How reasonable do you think it is to just say (and what do the
>> equivalent numbers look like on your particular benchmark) "put some
>> variance on your period config if you want variance"?
> Run-to-run variance is expected with this patch, as the patch depends
> on the time up to the last period/10 as the basis for interleaving.
> That is what I could infer from this comment about variance. Please correct me if not.

My question is what the numbers look like if you instead prepare the
cgroups with periods that are something like 97ms and 103ms instead of
both 100ms (keeping the quota as the same proportion as the original).
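
To make the suggestion concrete, here is a rough sketch, not a measurement.
It assumes a 50% quota (the original quota is not restated here), assumes
both period timers start in phase at t=0, and models expiries on an exact
millisecond grid, which ignores timer slack and everything else the kernel
does. The helper names are made up for illustration. It rescales the quota
so the quota/period ratio stays the same, and counts how often the two
period timers would expire at the same instant.

```python
from fractions import Fraction


def scaled_quota_us(orig_quota_us, orig_period_us, new_period_us):
    """Rescale the quota so quota/period stays the same for a new period."""
    return int(Fraction(orig_quota_us, orig_period_us) * new_period_us)


def coinciding_expiries(period_a_ms, period_b_ms, horizon_ms):
    """Count instants in (0, horizon_ms] where both period timers expire.

    Toy model: both timers are assumed to start at t=0 and to fire at
    exact multiples of their period.
    """
    expiries_b = set(range(period_b_ms, horizon_ms + 1, period_b_ms))
    expiries_a = range(period_a_ms, horizon_ms + 1, period_a_ms)
    return sum(1 for t in expiries_a if t in expiries_b)


if __name__ == "__main__":
    # Assumed baseline: 50ms quota per 100ms period (hypothetical values).
    base_quota_us, base_period_us = 50_000, 100_000

    for period_us in (97_000, 103_000):
        quota_us = scaled_quota_us(base_quota_us, base_period_us, period_us)
        # cgroup v2 cpu.max takes "<quota> <period>" in microseconds.
        print(f"cpu.max: {quota_us} {period_us}")

    horizon_ms = 10_000
    print("100/100:", coinciding_expiries(100, 100, horizon_ms))
    print("97/103: ", coinciding_expiries(97, 103, horizon_ms))
```

In this model two 100ms timers that start together stay aligned on every
period, while 97ms and 103ms only line up again once every 97 * 103 ms, so
the timers drift apart without any kernel change. Whether that recovers the
same benefit on the benchmark is exactly the open question.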