Re: [PATCH v4 1/4] sched/fair: Introduce primitives for CFS bandwidth burst

From: changhuaixin
Date: Mon Apr 19 2021 - 04:19:05 EST




> On Mar 18, 2021, at 11:10 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Mar 18, 2021 at 08:59:44AM -0400, Phil Auld wrote:
>> I admit to not having followed all the history of this patch set. That
>> said, when I see the above I just think your quota is too low for your
>> workload.
>
> This.
>
>> The burst (mis?)feature seems to be a way to bypass the quota. And it
>> sort of assumes cooperative containers that will only burst when they
>> need it and then go back to normal.
>
> Its not entirely unreasonable or unheard of. There's soft realtime
> systems that use this to increase the utilization with the trade-off
> that you're going to miss deadlines once every so often.
>
> If you do it right, you can calculate the probabilities. Or usually the
> other way around, you calculate the allowed variance/burst given a P
> value for making the deadline.
>
> Input then isn't the WCET for each task, but a runtime distribution as
> measured for your workload on your system etc..
>
> I used to have papers on this, but I can't seem to find them in a hurry.
>

Hi, I have done some reading on queueing theory and have tried to define the problem.

Divide real time into discrete periods, as cfs_b does. Assume there are m cgroups using
CFS Bandwidth Control. During each period, the i-th cgroup demands u_i CPU time,
where u_i is assumed to follow some distribution (exponential, Poisson or something else).
At the end of a period, if the sum of u_i is less than or equal to 100%, we call it an "idle" state.
The number of periods between two consecutive "idle" states stands for the WCET of the tasks
running during these periods.

Originally, with quota only, an "idle" state is guaranteed at the end of each period. Thus,
the WCET is the length of one period. With CPU Burst enabled, the sum of u_i may exceed 100%,
and the excess workload is handled in the following periods. The WCET is then the number of
periods between two consecutive "idle" states.
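
One possible way to write this carry-over down, only as a sketch: B_t (the backlog left at
the end of period t), u_{i,t} (the demand of the i-th cgroup in period t) and C = 100% are
notation introduced here, not taken from the patch, and the limit of the burst buffer is
not modelled.

```latex
B_0 = 0, \qquad
B_t = \max\Bigl(0,\; B_{t-1} + \sum_{i=1}^{m} u_{i,t} - C\Bigr), \qquad
\text{period } t \text{ ends in an ``idle'' state} \iff B_t = 0 .
```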

Then, we are going to calculate the probability that the WCET is longer than one period, and the
average WCET, for a given burst under some runtime distribution.
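
Before deriving anything in closed form, the two quantities could be estimated by simulation.
The following is only a minimal sketch under my own assumptions: u_i is exponential with the
same mean for every cgroup, unfinished demand is carried into the next period without any cap,
and per-cgroup quota and burst limits are not modelled. The function name simulate() and its
parameters are hypothetical, they do not come from the patch.

```python
import random


def simulate(m, mean_demand, capacity=1.0, periods=100_000, seed=0):
    """Estimate P(WCET > 1 period) and the average WCET, in periods.

    m           -- number of cgroups
    mean_demand -- mean of each u_i, as a fraction of total CPU (must be > 0)
    capacity    -- CPU time available per period (1.0 means 100%)
    """
    rng = random.Random(seed)
    backlog = 0.0   # demand carried over from earlier periods
    current = 0     # periods elapsed since the last "idle" state
    lengths = []    # number of periods between consecutive "idle" states

    for _ in range(periods):
        # Assumption: each u_i is exponential with mean mean_demand.
        demand = sum(rng.expovariate(1.0 / mean_demand) for _ in range(m))
        # Work that does not fit into this period is pushed to the next one.
        backlog = max(0.0, backlog + demand - capacity)
        current += 1
        if backlog == 0.0:  # everything was served: "idle" state
            lengths.append(current)
            current = 0

    if not lengths:  # no "idle" state was ever reached
        return {"p_longer_than_one": float("nan"),
                "mean_wcet": float("nan"),
                "samples": 0}

    longer = sum(1 for n in lengths if n > 1)
    return {"p_longer_than_one": longer / len(lengths),
            "mean_wcet": sum(lengths) / len(lengths),
            "samples": len(lengths)}


if __name__ == "__main__":
    # Example: 4 cgroups, each demanding 20% of the CPU on average.
    print(simulate(m=4, mean_demand=0.2))
```

Other distributions can be tried by replacing the rng.expovariate() call. If the total mean
demand is at or above 100%, "idle" states become rare and the estimates are not meaningful.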

Basically, this is based on the previous mails. I am sending this email to see whether there is
anything wrong with the problem definition.