Re: Please, put 64-bit counter per task and incr.by.one each ctxt switch.

From: J.C. Pizarro
Date: Sat Feb 23 2008 - 23:09:40 EST


On 2008/2/24, Rik van Riel <riel@xxxxxxxxxx> wrote:
> On Sun, 24 Feb 2008 04:08:38 +0100
> "J.C. Pizarro" <jcpiza@xxxxxxxxx> wrote:
>
> > We will need 64 bit counters of the slow context switches,
> > one counter for each new created task (e.g. u64 ctxt_switch_counts;)
>
>
> Please send a patch ...

diff -ur linux-2.6_git-20080224.orig/include/linux/sched.h
linux-2.6_git-20080224/include/linux/sched.h
--- linux-2.6_git-20080224.orig/include/linux/sched.h 2008-02-24
01:04:18.000000000 +0100
+++ linux-2.6_git-20080224/include/linux/sched.h 2008-02-24
04:50:18.000000000 +0100
@@ -1007,6 +1007,12 @@
struct hlist_head preempt_notifiers;
#endif

+ unsigned long long ctxt_switch_counts; /* 64-bit switches' count */
+ /* ToDo:
+ * To implement a poller/clock for CPU-scheduler that only reads
+ * these counts of context switches of the runqueue's tasks.
+ * No problem if this poller/clock is not implemented. */
+
/*
* fpu_counter contains the number of consecutive context switches
* that the FPU is used. If this is over a threshold, the lazy fpu
diff -ur linux-2.6_git-20080224.orig/kernel/sched.c
linux-2.6_git-20080224/kernel/sched.c
--- linux-2.6_git-20080224.orig/kernel/sched.c 2008-02-24
01:04:19.000000000 +0100
+++ linux-2.6_git-20080224/kernel/sched.c 2008-02-24
04:33:57.000000000 +0100
@@ -2008,6 +2008,8 @@
BUG_ON(p->state != TASK_RUNNING);
update_rq_clock(rq);

+ p->ctxt_switch_counts = 0ULL; /* task's 64-bit counter inited 0 */
+
p->prio = effective_prio(p);

if (!p->sched_class->task_new || !current->se.on_rq) {
@@ -2189,8 +2191,14 @@
context_switch(struct rq *rq, struct task_struct *prev,
struct task_struct *next)
{
+ unsigned long flags;
+ struct rq *rq_prev;
struct mm_struct *mm, *oldmm;

+ rq_prev = task_rq_lock(prev, &flags); /* locking the prev task */
+ prev->ctxt_switch_counts++; /* incr.+1 the task's 64-bit counter */
+ task_rq_unlock(rq_prev, &flags); /* unlocking the prev task */
+
prepare_task_switch(rq, prev, next);
mm = next->mm;
oldmm = prev->active_mm;

> > I will explain your later why of it.
>
>
> ... and explain exactly why the kernel needs this extra code.

One reason: for the objective of gain interactivity, it's an issue that
CFS fair scheduler lacks it.

o:)

Attachment: linux-2.6_git-20080224_ctxt_switch_counts.patch
Description: Binary data