Re: [patch 1/1] sched: update_curr versus correct cfs_rq in check_preempt_wakeup

From: Paul Turner
Date: Tue Jul 05 2011 - 22:07:59 EST


On Sat, Jul 2, 2011 at 3:27 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
>
> * Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
>
>> On Sat, 2011-07-02 at 11:08 +0200, Ingo Molnar wrote:
>> > * Paul Turner <pjt@xxxxxxxxxx> wrote:
>> >
>> > > We update_curr() versus the current entity as the preemption
>> > > decision is based on the relative vruntime.  However, update_curr()
>> > > is not hierarchical and in the group scheduling case
>> > > find_matching_se() will have us making the comparison on a cfs_rq
>> > > different to the one just updated.
>> >
>> > Would be nice to include more contextual information in the
>> > changelog: how did you find it, what effect (if any) did you see
>> > from this patch, what effect do you expect others to see (if
>> > any).
>>
>> Agreed that the Changelog can be improved. From talking to pjt on
>> IRC though, I think he spotted this by reading through the code.
>
> 'code review' is a perfect answer to the 'how did you find it'

Sure, sorry for omitting this -- updated below.

> question: when people read the changelog they will know that no
> practical effect has been observed (yet).

There should definitely be a measurable practical effect: for the
running task we potentially leave up to a tick of execution
unaccounted when deciding whether or not we cross the wakeup_gran to
preempt.

Because this drift is bounded above by our vruntime updates within
entity_tick(), its negative effects are largely masked.
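
As a rough illustration of the effect described above, here is a userspace
toy model. It is not kernel code: the toy_* names, the struct layout and the
numbers used with it are assumptions made up for this sketch. It only shows
how runtime that has not yet been folded into vruntime can flip a
granularity-based preemption decision.

```c
#include <stdint.h>

/* Toy stand-in for a scheduling entity; only what the sketch needs. */
struct toy_se {
	uint64_t vruntime;	/* accounted virtual runtime, in ns */
	uint64_t unaccounted;	/* runtime executed since the last update, in ns */
};

/* Hypothetical analogue of update_curr(): fold pending runtime into vruntime. */
static void toy_update_curr(struct toy_se *se)
{
	se->vruntime += se->unaccounted;
	se->unaccounted = 0;
}

/*
 * Returns 1 if the waking entity should preempt, i.e. the current entity's
 * vruntime leads the wakee's by more than the granularity.
 */
static int toy_should_preempt(const struct toy_se *curr,
			      const struct toy_se *wakee, uint64_t gran)
{
	int64_t vdiff = (int64_t)(curr->vruntime - wakee->vruntime);

	return vdiff > (int64_t)gran;
}
```

With an assumed granularity of 1 ms, a current entity at vruntime 10 ms with
0.9 ms still unaccounted, and a wakee at 9.5 ms, the stale comparison sees a
lead of 0.5 ms and does not preempt. After toy_update_curr() the lead is
1.4 ms and the wakee preempts.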

>
> Thanks,
>
>        Ingo
>


sched: update correct entity's runtime in check_preempt_wakeup()

While looking at check_preempt_wakeup() I realized that we are potentially
updating the wrong entity in the fair-group scheduling case. In this case
the current task's cfs_rq may not be the same as the one used for the
comparison between the waking task and the existing task's vruntime.

This potentially results in us using a stale vruntime in the preemption
decision, providing a small false preference for the previous task. The
effects of this are bounded since we always perform a hierarchical update on
the tick.

Signed-off-by: Paul Turner <pjt@xxxxxxxxxx>

---
kernel/sched_fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: tip2/kernel/sched_fair.c
===================================================================
--- tip2.orig/kernel/sched_fair.c
+++ tip2/kernel/sched_fair.c
@@ -1919,8 +1919,8 @@ static void check_preempt_wakeup(struct
 	if (!sched_feat(WAKEUP_PREEMPT))
 		return;

-	update_curr(cfs_rq);
 	find_matching_se(&se, &pse);
+	update_curr(cfs_rq_of(se));
 	BUG_ON(!pse);
 	if (wakeup_preempt_entity(se, pse) == 1) {