Re: 4.3 group scheduling regression

From: Mike Galbraith
Date: Tue Oct 13 2015 - 00:08:43 EST


On Tue, 2015-10-13 at 03:55 +0800, Yuyang Du wrote:
> On Mon, Oct 12, 2015 at 12:23:31PM +0200, Mike Galbraith wrote:
> > On Mon, 2015-10-12 at 10:12 +0800, Yuyang Du wrote:
> >
> > > I am guessing it is in calc_tg_weight(), and the naughty boys do make
> > > themselves more favored, what a reality...
> > >
> > > Mike, may I beg you to test the following?
> >
> > Wow, that was quick. Dinky patch made it all better.
> >
> > -----------------------------------------------------------------------------------------------------------------
> > Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | Maximum delay at |
> > -----------------------------------------------------------------------------------------------------------------
> > oink:(8) | 739056.970 ms | 27270 | avg: 2.043 ms | max: 29.105 ms | max at: 339.988310 s
> > mplayer:(25) | 36448.997 ms | 44670 | avg: 1.886 ms | max: 72.808 ms | max at: 302.153121 s
> > Xorg:988 | 13334.908 ms | 22210 | avg: 0.081 ms | max: 25.005 ms | max at: 269.068666 s
> > testo:(9) | 2558.540 ms | 13703 | avg: 0.124 ms | max: 6.412 ms | max at: 279.235272 s
> > konsole:1781 | 1084.316 ms | 1457 | avg: 0.006 ms | max: 1.039 ms | max at: 268.863379 s
> > kwin:1734 | 879.645 ms | 17855 | avg: 0.458 ms | max: 15.788 ms | max at: 268.854992 s
> > pulseaudio:1808 | 356.334 ms | 15023 | avg: 0.028 ms | max: 6.134 ms | max at: 324.479766 s
> > threaded-ml:3483 | 292.782 ms | 25769 | avg: 0.364 ms | max: 40.387 ms | max at: 294.550515 s
> > plasma-desktop:1745 | 265.055 ms | 1470 | avg: 0.102 ms | max: 21.886 ms | max at: 267.724902 s
> > perf:3439 | 61.677 ms | 2 | avg: 0.117 ms | max: 0.232 ms | max at: 367.043889 s
>
> Phew...
>
> I think the real disease may be that tg->load_avg is not updated in time;
> i.e., after a migration, the source cfs_rq does not decrease its
> contribution to the parent's tg->load_avg fast enough.
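
A minimal user-space sketch of the bookkeeping described above (the names
loosely mirror fair.c; the arithmetic is simplified, and this is not the
kernel's actual code): each cfs_rq publishes a contribution into the shared
tg->load_avg, and calc_tg_weight() corrects that sum with the local cfs_rq's
current load, so only the source CPU sees through its own stale contribution
after a migration.

        #include <stdio.h>

        struct cfs_rq {
                long load_avg;            /* this CPU's current group load */
                long tg_load_avg_contrib; /* what was last published to the tg sum */
        };

        struct task_group {
                long load_avg;            /* sum of all CPUs' published contribs */
        };

        /* calc_tg_weight()-style estimate: global sum, minus our possibly
         * stale published contribution, plus our current local load. */
        static long calc_tg_weight(struct task_group *tg, struct cfs_rq *cfs_rq)
        {
                long tg_weight = tg->load_avg;

                tg_weight -= cfs_rq->tg_load_avg_contrib;
                tg_weight += cfs_rq->load_avg;
                return tg_weight;
        }

        int main(void)
        {
                /* Two CPUs, 1024 load each, both published. */
                struct task_group tg = { .load_avg = 2048 };
                /* A task just migrated away from this CPU: the local load is
                 * gone, but the old 1024 is still in tg->load_avg. */
                struct cfs_rq src = { .load_avg = 0, .tg_load_avg_contrib = 1024 };

                /* The source CPU's own view is corrected locally... */
                printf("src-side tg_weight: %ld\n", calc_tg_weight(&tg, &src)); /* 1024 */
                /* ...but every other CPU keeps seeing the inflated sum until
                 * the source republishes -- the delay complained about here. */
                printf("stale tg->load_avg: %ld\n", tg.load_avg);               /* 2048 */
                return 0;
        }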

It sounded like you wanted me to run the patch below alone. If so, it's a no-go.

-----------------------------------------------------------------------------------------------------------------
Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | Maximum delay at |
-----------------------------------------------------------------------------------------------------------------
oink:(8) | 787001.236 ms | 21641 | avg: 0.377 ms | max: 21.991 ms | max at: 51.504005 s
mplayer:(25) | 4256.224 ms | 7264 | avg: 19.698 ms | max: 2087.489 ms | max at: 115.294922 s
Xorg:1011 | 1507.958 ms | 4081 | avg: 8.349 ms | max: 1652.200 ms | max at: 126.908021 s
konsole:1752 | 697.806 ms | 1186 | avg: 5.749 ms | max: 160.189 ms | max at: 53.037952 s
testo:(9) | 438.164 ms | 2551 | avg: 6.616 ms | max: 215.527 ms | max at: 117.302455 s
plasma-desktop:1716 | 280.418 ms | 1624 | avg: 3.701 ms | max: 574.806 ms | max at: 53.582261 s
kwin:1708 | 144.986 ms | 2422 | avg: 3.301 ms | max: 315.707 ms | max at: 116.555721 s

> --
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4df37a4..3dba883 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2686,12 +2686,13 @@ static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq);
>  static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
>  {
>          struct sched_avg *sa = &cfs_rq->avg;
> -        int decayed;
> +        int decayed, updated = 0;
>
>          if (atomic_long_read(&cfs_rq->removed_load_avg)) {
>                  long r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
>                  sa->load_avg = max_t(long, sa->load_avg - r, 0);
>                  sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0);
> +                updated = 1;
>          }
>
>          if (atomic_long_read(&cfs_rq->removed_util_avg)) {
> @@ -2708,7 +2709,7 @@ static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
>          cfs_rq->load_last_update_time_copy = sa->last_update_time;
>  #endif
>
> -        return decayed;
> +        return decayed | updated;
>  }
>
>  /* Update task and its cfs_rq load average */
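
The point of folding "updated" into the return value, sketched as a runnable
user-space toy (simplified stand-ins, not the kernel's code; the caller shape
is from memory of the 4.3-era fair.c, where update_load_avg() only republishes
into tg->load_avg when update_cfs_rq_load_avg() reports a change):

        #include <stdio.h>

        static long removed_load;     /* stands in for cfs_rq->removed_load_avg */
        static long cfs_load = 1024;  /* stands in for cfs_rq->avg.load_avg */
        static long tg_load  = 1024;  /* stands in for tg->load_avg */
        static long contrib  = 1024;  /* cfs_rq->tg_load_avg_contrib */

        /* A nonzero return tells the caller to propagate to the tg sum. */
        static int update_cfs_rq_load_avg(void)
        {
                int decayed = 0, updated = 0;

                if (removed_load) {
                        cfs_load -= removed_load; /* drop migrated-away load */
                        removed_load = 0;
                        updated = 1;              /* the patched line */
                }
                /* decayed would come from __update_load_avg(); assume none. */
                return decayed | updated;
        }

        static void update_tg_load_avg(void)
        {
                tg_load += cfs_load - contrib;    /* republish our contribution */
                contrib  = cfs_load;
        }

        int main(void)
        {
                removed_load = 1024;              /* a task just migrated away */
                if (update_cfs_rq_load_avg())     /* 0 before the patch, so the */
                        update_tg_load_avg();     /* republish never happened   */
                printf("tg->load_avg = %ld\n", tg_load); /* 0 with the patch */
                return 0;
        }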

