Re: [RFC PATCH v2 11/17] sched: Basic tracking of matching tasks

From: Aaron Lu
Date: Sun Apr 28 2019 - 23:36:55 EST


On Tue, Apr 23, 2019 at 04:18:16PM +0000, Vineeth Remanan Pillai wrote:
> +/*
> + * l(a,b)
> + * le(a,b) := !l(b,a)
> + * g(a,b) := l(b,a)
> + * ge(a,b) := !l(a,b)
> + */
> +
> +/* real prio, less is less */
> +static inline bool __prio_less(struct task_struct *a, struct task_struct *b, bool core_cmp)
> +{
> + u64 vruntime;
> +
> + int pa = __task_prio(a), pb = __task_prio(b);
> +
> + if (-pa < -pb)
> + return true;
> +
> + if (-pb < -pa)
> + return false;
> +
> + if (pa == -1) /* dl_prio() doesn't work because of stop_class above */
> + return !dl_time_before(a->dl.deadline, b->dl.deadline);
> +
> + vruntime = b->se.vruntime;
> + if (core_cmp) {
> + vruntime -= task_cfs_rq(b)->min_vruntime;
> + vruntime += task_cfs_rq(a)->min_vruntime;
> + }
> + if (pa == MAX_RT_PRIO + MAX_NICE) /* fair */
> + return !((s64)(a->se.vruntime - vruntime) <= 0);
> +
> + return false;
> +}

This unfortunately still doesn't work.

Consider the following task layout on two sibling CPUs(cpu0 and cpu1):

rq0.cfs_rq rq1.cfs_rq
| |
se_bash se_hog

se_hog is the sched_entity for a cpu intensive task and se_bash is the
sched_entity for bash.

There are two problems:
1 SCHED_DEBIT
when user execute some commands through bash, say ls, bash will fork.
The newly forked ls' vruntime is set in the future due to SCHED_DEBIT.
This made 'ls' lose in __prio_less() when compared with hog, whose
vruntime may very likely be the same as its cfs_rq's min_vruntime.

This is OK since we do not want forked process to starve already running
ones. The problem is, since hog keeps running, its vruntime will always
sync with its cfs_rq's min_vruntime. OTOH, 'ls' can not run, its
cfs_rq's min_vruntime doesn't proceed, making 'ls' always lose to hog.

2 who schedules, who wins
so I disabled SCHED_DEBIT, for testing's purpose. When cpu0 schedules,
ls could win where both sched_entity's vruntime is the same as their
cfs_rqs' min_vruntime. So does hog: when cpu1 schedules, hog can preempt
ls in the same way. The end result is, interactive task can lose to cpu
intensive task and ls can feel "dead".

I haven't figured out a way to solve this yet. A core wide cfs_rq's
min_vruntime can probably solve this. Your suggestions are appreciated.