Re: [RFC PATCH 00/13] Core scheduling v5

From: Li, Aubrey
Date: Mon Jun 29 2020 - 17:26:00 EST


Hi Vineeth,

On 2020/6/26 4:12, Vineeth Remanan Pillai wrote:
> On Wed, Mar 4, 2020 at 12:00 PM vpillai <vpillai@xxxxxxxxxxxxxxxx> wrote:
>>
>>
>> Fifth iteration of the Core-Scheduling feature.
>>
> It's probably time for another iteration, and we are planning to post v6 based
> on this branch:
> https://github.com/digitalocean/linux-coresched/tree/coresched/pre-v6-v5.7.y
>
> Just wanted to share the details about v6 here before posting the patch
> series. If there is no objection to the following, we shall be posting
> the v6 early next week.
>
> The main changes in v6 are the following:
> 1. Address Peter's comments in v5
> - Code cleanup
> - Remove fixes related to hotplugging.
> - Split the patch out for force idling starvation
> 3. Fix for RCU deadlock
> 4. core wide priority comparison minor re-work.
> 5. IRQ Pause patch
> 6. Documentation
> - https://github.com/digitalocean/linux-coresched/blob/coresched/pre-v6-v5.7.y/Documentation/admin-guide/hw-vuln/core-scheduling.rst
>
> This version is much leaner compared to v5 due to the removal of hotplug
> support. As a result, dynamic coresched enable/disable on cpus due to
> smt on/off on the core does not function anymore. I tried to reproduce the
> crashes during hotplug, but could not reproduce them reliably. The plan is to
> try to reproduce the crashes with v6, and document each corner case for crashes
> as we fix those. Previously, we randomly fixed the issues without clear
> documentation and the fixes became complex over time.
>
> TODO list:
>
> - Interface discussions could not come to a conclusion in v5, hence we would
> like to restart the discussion and reach a consensus on it.
> - https://lwn.net/ml/linux-kernel/20200520222642.70679-1-joel@xxxxxxxxxxxxxxxxx
>
> - Core wide vruntime calculation needs rework:
> - https://lwn.net/ml/linux-kernel/20200506143506.GH5298@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> - Load balancing/migration changes ignore group weights:
> - https://lwn.net/ml/linux-kernel/20200225034438.GA617271@xxxxxxxxxxxxxxxxxxxxxxxxxxxx

According to Aaron's response below:
https://lwn.net/ml/linux-kernel/20200305085231.GA12108@xxxxxxxxxxxxxxxxxxxxxxxxxxxx/

The following logic seems to be helpful for Aaron's case.

+	/*
+	 * Ignore cookie match if there is a big imbalance between the src rq
+	 * and dst rq.
+	 */
+	if ((src_rq->cfs.h_nr_running - rq->cfs.h_nr_running) > 1)
+		return true;

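To make the intent of that check concrete, here is a minimal userspace sketch. Everything named sketch_* is a hypothetical stand-in, not kernel code, and the fallback to a plain cookie comparison is my assumption about the surrounding function. The sketch also uses a signed difference, on the assumption that the counters are unsigned, in which case a plain subtraction would wrap when the destination is the busier runqueue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel's cfs_rq and rq. */
struct sketch_cfs_rq {
	unsigned int h_nr_running;	/* runnable CFS tasks in the hierarchy */
};

struct sketch_rq {
	struct sketch_cfs_rq cfs;
	unsigned long core_cookie;	/* cookie currently running on the core */
};

/*
 * Decide whether a task carrying task_cookie may move from src to dst.
 * The fallback cookie comparison is an assumption made for this sketch.
 */
static bool sketch_can_migrate(const struct sketch_rq *src,
			       const struct sketch_rq *dst,
			       unsigned long task_cookie)
{
	long imbalance;

	assert(src && dst);

	/*
	 * Signed difference: with unsigned counters, a plain subtraction
	 * would wrap to a huge value whenever dst is busier than src.
	 */
	imbalance = (long)src->cfs.h_nr_running - (long)dst->cfs.h_nr_running;

	/* Ignore the cookie when src is busier than dst by more than one task. */
	if (imbalance > 1)
		return true;

	return dst->core_cookie == task_cookie;
}
```

With these stand-ins, a source with 4 runnable tasks and a destination with 1 may migrate even on a cookie mismatch, while 2 versus 1 still requires the cookies to match.
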
I didn't see any other comments on the patch here:
https://lwn.net/ml/linux-kernel/67e46f79-51c2-5b69-71c6-133ec10b68c4@xxxxxxxxxxxxxxx/

Do we have another way to address this issue?

Thanks,
-Aubrey