Re: [PATCH] sched/fair: don't assign runtime for throttled cfs_rq

From: Valentin Schneider
Date: Mon Aug 19 2019 - 13:34:18 EST


On 16/08/2019 18:19, Valentin Schneider wrote:
[...]
> Yeah it's probably pretty stupid. IIRC throttled cfs_rq means frozen
> rq_clock, so any subsequent call to update_curr() on a throttled cfs_rq
> should lead to an early bailout anyway due to delta_exec <= 0.
>

I did some more tracing. The issue seems to be that
assign_cfs_rq_runtime() can make ->runtime_remaining positive without
marking the cfs_rq as unthrottled.

So AFAICT we'd need something like this:

-----8<-----
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1054d2cf6aaa..ffbb4dfc4b81 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4385,6 +4385,11 @@ static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq)
 	return rq_clock_task(rq_of(cfs_rq)) - cfs_rq->throttled_clock_task_time;
 }

+static inline int cfs_rq_throttled(struct cfs_rq *cfs_rq)
+{
+	return cfs_bandwidth_used() && cfs_rq->throttled;
+}
+
 /* returns 0 on failure to allocate runtime */
 static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)
 {
@@ -4411,6 +4416,9 @@ static int assign_cfs_rq_runtime(struct cfs_rq *cfs_rq)

 	cfs_rq->runtime_remaining += amount;

+	if (cfs_rq->runtime_remaining > 0 && cfs_rq_throttled(cfs_rq))
+		unthrottle_cfs_rq(cfs_rq);
+
 	return cfs_rq->runtime_remaining > 0;
 }

@@ -4439,11 +4447,6 @@ void account_cfs_rq_runtime(struct cfs_rq *cfs_rq, u64 delta_exec)
 	__account_cfs_rq_runtime(cfs_rq, delta_exec);
 }

-static inline int cfs_rq_throttled(struct cfs_rq *cfs_rq)
-{
-	return cfs_bandwidth_used() && cfs_rq->throttled;
-}
-
 /* check whether cfs_rq, or any parent, is throttled */
 static inline int throttled_hierarchy(struct cfs_rq *cfs_rq)
 {
----->8-----

Does that make sense? If so we *may* want to add some ->runtime_remaining
wrappers (e.g. {add/remove}_runtime()) and have the check in there to
make sure it's not forgotten.
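
To make the wrapper idea a bit more concrete, here is a rough userspace
sketch, not kernel code. The struct is a simplified stand-in that only
models ->runtime_remaining and ->throttled, unthrottle_model() stands in
for unthrottle_cfs_rq(), and add_runtime()/remove_runtime() are just the
hypothetical names from above. The point is only that every path adding
runtime goes through one helper that carries the unthrottle check:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct cfs_rq: only the two fields discussed. */
struct cfs_rq_model {
	int64_t runtime_remaining;
	bool throttled;
};

/*
 * Stand-in for unthrottle_cfs_rq(). The real function does much more
 * than clearing a flag; this model only tracks the state bit.
 */
static void unthrottle_model(struct cfs_rq_model *cfs_rq)
{
	cfs_rq->throttled = false;
}

/*
 * Hypothetical wrapper: the only place that increases runtime_remaining,
 * so the "positive runtime implies not throttled" check cannot be
 * forgotten by a caller.
 */
static void add_runtime(struct cfs_rq_model *cfs_rq, int64_t amount)
{
	cfs_rq->runtime_remaining += amount;

	if (cfs_rq->runtime_remaining > 0 && cfs_rq->throttled)
		unthrottle_model(cfs_rq);
}

/*
 * Hypothetical counterpart for consumption. It only does the accounting;
 * deciding when to throttle is deliberately left out of this sketch.
 */
static void remove_runtime(struct cfs_rq_model *cfs_rq, int64_t amount)
{
	cfs_rq->runtime_remaining -= amount;
}
```

With something like this, a throttled queue that is topped up but still
has runtime_remaining <= 0 stays throttled, and it is unthrottled in the
same call that takes it above zero.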