Re: [PATCH v4 04/10] sched/fair: Dynamically update cfs_overload_cpus

From: Steven Sistare
Date: Fri Dec 07 2018 - 17:36:24 EST


On 12/7/2018 3:20 PM, Valentin Schneider wrote:
> Hi Steve,
>
> On 06/12/2018 21:28, Steve Sistare wrote:
> [...]
>> @@ -3724,6 +3725,28 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
>> rq->misfit_task_load = task_h_load(p);
>> }
>>
>> +static void overload_clear(struct rq *rq)
>
> Nitpicky nit: cfs_overload_{clear, set} might be a bit better, just to
> explicitly differentiate this from rq->rd->overload. Although I suppose
> the naming problem will show up again if/when you try to expand this to
> other classes...

This is static within fair.c which is CFS, so I think the name is OK.

>> +{
>> + struct sparsemask *overload_cpus;
>> +
>> + rcu_read_lock();
>> + overload_cpus = rcu_dereference(rq->cfs_overload_cpus);
>> + if (overload_cpus)
>> + sparsemask_clear_elem(overload_cpus, rq->cpu);
>> + rcu_read_unlock();
>> +}
>> +
>> +static void overload_set(struct rq *rq)
>> +{
>> + struct sparsemask *overload_cpus;
>> +
>> + rcu_read_lock();
>> + overload_cpus = rcu_dereference(rq->cfs_overload_cpus);
>> + if (overload_cpus)
>> + sparsemask_set_elem(overload_cpus, rq->cpu);
>> + rcu_read_unlock();
>> +}
>> +
>> #else /* CONFIG_SMP */
>>
>> #define UPDATE_TG 0x0
> [...]
>> @@ -4468,8 +4495,12 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
>> dequeue = 0;
>> }
>>
>> - if (!se)
>> + if (!se) {
>> sub_nr_running(rq, task_delta);
>> + if (prev_nr >= 2 && prev_nr - task_delta < 2)
>> + overload_clear(rq);
>> +
>> + }
>
> Eventually it'd be nice to squash those into {add, sub}_nr_running(), but
> you already mentioned wanting to stick to CFS for now, so I don't think
> it's *too* much of a big deal.

Maybe. It depends on a design decision to be made if/when we add bitmap
based stealing to other scheduling classes. Do we maintain one bitmap
for overloaded CPUs where the overload may be caused by any mix of different
task classes? If yes, then the bitmap search for one class such as RT
will inspect and reject overloaded CPUs that only have CFS tasks, which
making the search less efficient. I am leaning towards a separate bitmap
per class to avoid that.

- Steve