Re: [PATCH 8/9] sched/fair: Optimize cgroup pick_next_task_fair

From: bsegall
Date: Tue Jan 21 2014 - 16:51:24 EST


Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

> On Tue, Jan 21, 2014 at 12:03:38PM -0800, bsegall@xxxxxxxxxx wrote:
>
>> > Indeed, in that case we can miss something... we could try to call
>> > check_cfs_rq_runtime() from the initial top-down selection loop? When
>> > true, just put the entire stack and don't pretend to be smart?
>>
>> Yeah, I think that should work. I wasn't sure if there could be a
>> problem of doing throttle_cfs_rq(parent); throttle_cfs_rq(child);, but
>> thinking about it, that has to be ok, because schedule can do that if
>> deactivate throttled the parent, schedule calls update_rq_clock, and
>> then put_prev throttled the child.
>
> Maybe something like the below? Completely untested and everything.
>
> There's the obvious XXX fail that was also in the previous version; not
> sure how to deal with that yet, either we need to change the interface
> to take struct task_struct **prev, or get smarter :-)
>
> Any other obvious fails in here?

prev can be NULL to start with, and hrtick should be handled in both
paths. How about this on top of your patch (equally untested) to fix
those and the XXX? The double-check on nr_running is annoying, but
necessary when prev slept.

---
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4528,14 +4528,8 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev)
 	if (!cfs_rq->nr_running)
 		return NULL;

-	if (!IS_ENABLED(CONFIG_FAIR_GROUP_SCHED) ||
-	    (prev->sched_class != &fair_sched_class)) {
-		prev->sched_class->put_prev_task(rq, prev);
-		prev = NULL;
-	}
-
 #ifdef CONFIG_FAIR_GROUP_SCHED
-	if (!prev)
+	if (!prev || prev->sched_class != &fair_sched_class)
 		goto simple;

 	do {
@@ -4552,10 +4546,8 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev)
 		else
 			curr = NULL;

-		if (unlikely(check_cfs_rq_runtime(cfs_rq))) {
-			put_prev_task_fair(rq, prev);
+		if (unlikely(check_cfs_rq_runtime(cfs_rq)))
 			goto simple;
-		}

 		se = pick_next_entity(cfs_rq, curr);
 		cfs_rq = group_cfs_rq(se);
@@ -4589,15 +4581,19 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev)
 		set_next_entity(cfs_rq, se);
 	}

+	if (hrtick_enabled(rq))
+		hrtick_start_fair(rq, p);
+
 	return p;
 simple:
 #endif
 	cfs_rq = &rq->cfs;

-	if (!cfs_rq->nr_running) {
-		// XXX FAIL we should never return NULL after putting @prev
+	if (!cfs_rq->nr_running)
 		return NULL;
-	}
+
+	if (prev)
+		prev->sched_class->put_prev_task(rq, prev);

 	do {
 		se = pick_next_entity(cfs_rq, NULL);