Re: Make yield_task_fair more efficient

From: Mike Galbraith
Date: Thu Feb 21 2008 - 07:02:50 EST


Hi,

On Thu, 2008-02-21 at 15:01 +0530, Balbir Singh wrote:
> Ingo Molnar wrote:
> > * Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:
> >
> >> If you insist that sched_yield() is bad, I might agree, but how does
> >> my patch make things worse? [...]
> >
> > it puts new instructions into the hotpath.
> >
> >> [...] In my benchmarks, it has helped the sched_yield case; why is
> >> that bad? [...]
> >
> > I had the same cache for the rightmost task in earlier CFS (it's a
> > really obvious thing) but removed it. It wasn't a bad idea, but it hurt
> > the fastpath, hence I removed it. Algorithms and implementations are a
> > constant balancing act.
>
> This is more convincing. Was the code ever in git? How did you measure
> the overhead?

Counting enqueue/dequeue cycles on my 3GHz P4/HT, running a 60-second
netperf test that does ~85k context switches per second, shows:

sched_cycles: 7198444348 unpatched
vs
sched_cycles: 8574036268 patched
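
Back of the envelope, assuming the ~85k/s switch rate held for the full
60 seconds and the two totals cover comparable work:

  (8574036268 - 7198444348) cycles / (85000 switches/s * 60 s)
      = 1375591920 cycles / 5100000 switches
      ~= 270 extra cycles per context switch spent in enqueue/dequeue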

-Mike
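
The instrumentation patch is below: it brackets enqueue_task() and
dequeue_task() with rdtscll() timestamps, accumulates the delta per task,
and prints each task's total from the exit path for the netperf/netserver
processes. (The totals include the cost of the rdtscll() calls themselves,
identical for patched and unpatched kernels, so read them as relative
numbers.)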

diff --git a/include/linux/sched.h b/include/linux/sched.h
index e217d18..a71edfe 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1262,6 +1262,7 @@ struct task_struct {
 	int latency_record_count;
 	struct latency_record latency_record[LT_SAVECOUNT];
 #endif
+	unsigned long long sched_cycles;
 };
 
 /*
diff --git a/kernel/exit.c b/kernel/exit.c
index 506a957..c5c3d5e 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -174,6 +174,8 @@ repeat:
 	}
 
 	write_unlock_irq(&tasklist_lock);
+	if (!strcmp("netperf", p->comm) || !strcmp("netserver", p->comm))
+		printk(KERN_WARNING "%s:%d sched_cycles: %llu\n", p->comm, p->pid, p->sched_cycles);
 	release_thread(p);
 	call_rcu(&p->rcu, delayed_put_task_struct);
 
diff --git a/kernel/sched.c b/kernel/sched.c
index f28f19e..d7c1b0b 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1301,15 +1301,25 @@ static void set_load_weight(struct task_struct *p)
 
 static void enqueue_task(struct rq *rq, struct task_struct *p, int wakeup)
 {
+	unsigned long long then, now;
+
+	rdtscll(then);
 	sched_info_queued(p);
 	p->sched_class->enqueue_task(rq, p, wakeup);
 	p->se.on_rq = 1;
+	rdtscll(now);
+	p->sched_cycles += now - then;
 }
 
 static void dequeue_task(struct rq *rq, struct task_struct *p, int sleep)
 {
+	unsigned long long then, now;
+
+	rdtscll(then);
 	p->sched_class->dequeue_task(rq, p, sleep);
 	p->se.on_rq = 0;
+	rdtscll(now);
+	p->sched_cycles += now - then;
 }
 
 /*
@@ -2009,6 +2019,7 @@ void wake_up_new_task(struct task_struct *p, unsigned long clone_flags)
 	update_rq_clock(rq);
 
 	p->prio = effective_prio(p);
+	p->sched_cycles = 0;
 
 	if (!p->sched_class->task_new || !current->se.on_rq) {
 		activate_task(rq, p, 0);
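
For a feel for how big that fixed rdtscll() cost is, here's a quick
userland sketch of the same bracketing trick (untested, assumes x86 and
gcc; back-to-back rdtsc isn't serialized, so it only gives a rough floor):

#include <stdio.h>
#include <stdint.h>

/* read the time stamp counter, as rdtscll() does in the kernel */
static inline uint64_t rdtsc(void)
{
	uint32_t lo, hi;

	asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
	return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
	uint64_t then, now;

	then = rdtsc();
	now = rdtsc();

	/* cost of one empty then/now bracket, charged to every
	 * enqueue/dequeue in the numbers above */
	printf("back-to-back rdtsc: %llu cycles\n",
	       (unsigned long long)(now - then));
	return 0;
}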

