Re: [RFC][PATCH] Improving directed yield scalability for PLE handler

From: Raghavendra K T
Date: Tue Sep 11 2012 - 02:11:51 EST

On 09/11/2012 01:42 AM, Andrew Theurer wrote:
On Mon, 2012-09-10 at 19:12 +0200, Peter Zijlstra wrote:
On Mon, 2012-09-10 at 22:26 +0530, Srikar Dronamraju wrote:
+static bool __yield_to_candidate(struct task_struct *curr, struct task_struct *p)
+ if (!curr->sched_class->yield_to_task)
+ return false;
+ if (curr->sched_class != p->sched_class)
+ return false;


Should we also add a check for whether the runq has a skip buddy (as
pointed out by Raghu), and return if the skip buddy is already set?

Oh right, I missed that suggestion.. the performance improvement went
from 81% to 139% using this, right?

It might make more sense to keep that separate, outside of this
function, since it's not a strict prerequisite.

+ if (task_running(p_rq, p) || p->state)
+ return false;
+ return true;

@@ -4323,6 +4340,10 @@ bool __sched yield_to(struct task_struct *p, bool preempt)
rq = this_rq();

+ /* optimistic test to avoid taking locks */
+ if (!__yield_to_candidate(curr, p))
+ goto out_irq;

So add something like:

/* Optimistic, if we 'raced' with another yield_to(), don't bother */
if (p_rq->cfs_rq->skip)
goto out_irq;

p_rq = task_rq(p);
double_rq_lock(rq, p_rq);

But I do have a question on this optimization, though: why do we check
p_rq->cfs_rq->skip and not rq->cfs_rq->skip?

That is, I'd like to see this thing explained a little better.

Does it go something like: p_rq is the runqueue of the task we'd like to
yield to, rq is our own, they might be the same. If we have a ->skip,
there's nothing we can do about it, OTOH p_rq having a ->skip and
failing the yield_to() simply means us picking the next VCPU thread,
which might be running on an entirely different cpu (rq) and could

Here are two new versions; both include a __yield_to_candidate(): "v3"
uses the check for p_rq->curr in guest mode, and "v4" uses the cfs_rq
skip check. Raghu, I am not sure if this is exactly what you want
implemented in v4.
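
As a reading aid for the candidate test quoted above, here is a minimal
userspace sketch in C of what the lock-free checks in
__yield_to_candidate() decide. The struct names (sched_class_model,
task_model), the on_cpu flag standing in for task_running(p_rq, p), and
the function name yield_to_candidate_model() are assumptions made for
illustration only; this is not the kernel code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sched_class: only records whether
 * the class provides a yield_to_task hook. */
struct sched_class_model {
    bool has_yield_to_task;
};

/* Hypothetical stand-in for struct task_struct. */
struct task_model {
    const struct sched_class_model *sched_class;
    long state;   /* 0 models a runnable task; non-zero models a sleeping one */
    bool on_cpu;  /* assumption: models task_running(p_rq, p) */
};

/* Mirrors the three checks quoted from the patch, done without locks. */
static bool yield_to_candidate_model(const struct task_model *curr,
                                     const struct task_model *p)
{
    assert(curr != NULL && p != NULL);

    /* The yielder's class must support directed yield at all. */
    if (!curr->sched_class->has_yield_to_task)
        return false;

    /* Both tasks must belong to the same scheduling class. */
    if (curr->sched_class != p->sched_class)
        return false;

    /* A target that is already running, or not runnable, gains nothing. */
    if (p->on_cpu || p->state)
        return false;

    return true;
}
```

In this model a false result corresponds to the "goto out_irq" path in
the quoted hunk, taken before double_rq_lock() is reached.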

Andrew, yes, that is what I had. I think there was a misunderstanding. My
intention was: if a directed yield has already happened in a runqueue
(say rqA), do not bother doing a directed yield to it. But unfortunately,
as PeterZ pointed out, that would have resulted in setting the next buddy
of a different runqueue than rqA.

So we can drop this "skip" idea. Pondering more over what to do. Can we
use the next buddy itself? Thinking...
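
To make the mismatch concrete, here is a small userspace model in C of
where the two buddy hints land after a directed yield. It assumes the
usual CFS behaviour that the yielding task is marked as skip buddy on its
own runqueue while the target is marked as next buddy on the target's
runqueue. The names rq_model, vcpu_task, directed_yield_model() and
skip_check_would_bail() are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; only identity matters here. */
struct vcpu_task {
    int id;
};

/* Hypothetical model of a runqueue: just the two buddy hints. */
struct rq_model {
    const struct vcpu_task *next;  /* models cfs_rq->next */
    const struct vcpu_task *skip;  /* models cfs_rq->skip */
};

/* Model of a successful directed yield from curr (on rq) to p (on p_rq). */
static void directed_yield_model(struct rq_model *rq,
                                 const struct vcpu_task *curr,
                                 struct rq_model *p_rq,
                                 const struct vcpu_task *p)
{
    assert(rq != NULL && curr != NULL && p_rq != NULL && p != NULL);
    p_rq->next = p;   /* "run p next" is recorded on the target's runqueue */
    rq->skip = curr;  /* "skip me" is recorded on the yielder's runqueue */
}

/* The dropped idea: give up if the target's runqueue has a skip buddy. */
static bool skip_check_would_bail(const struct rq_model *p_rq)
{
    return p_rq->skip != NULL;
}
```

With a task on rqA yielding to a task on rqB, the skip hint ends up on
rqA and the next hint on rqB, so a skip test on a runqueue says that
something yielded away from it, not that something was yielded to on it.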
