[PATCH] schedule inefficiency

Rik van Riel (H.H.vanRiel@phys.uu.nl)
Sun, 27 Sep 1998 16:28:16 +0200 (CEST)


Hi,

in response to the discussion about a possible improvement of
the scheduler, I've written a simple patch.

The patch:
- makes sure the scheduler only recalculates the priorities
of processes that are on the runqueue
- calculates the priority of all other processes at the
moment they are added to the runqueue
- makes the system feel a lot more responsive under high loads
- removes an unused field from the task struct (p->dec_flt)

The improved responsiveness is due to the load-independent
priority recharge in add_to_runqueue(). CPU hogs stay on the
runqueue, so they only get the load-dependent recharge in
schedule(). This puts CPU hogs at a slight disadvantage
compared to interactive processes (which sleep all the time).
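
To make the recharge arithmetic concrete, here is a minimal
user-space sketch of what the new code in the first sched.c
hunk below computes. The recharge() wrapper and main() are
mine, for illustration only; DEF_PRIORITY = 20 assumes the
usual i386 value (20*HZ/100 with HZ=100):

#include <stdio.h>

#define DEF_PRIORITY 20

/* same formula and clamp as in add_to_runqueue() below;
   the counter is signed, exactly like p->counter */
static long recharge(long counter, long priority, unsigned long slept)
{
	counter += (slept * priority) / DEF_PRIORITY;
	/* normalize after a very long sleep or a jiffies wrap */
	if (counter > 2 * priority || counter < 0)
		counter = 2 * priority;
	return counter;
}

int main(void)
{
	/* default-priority task woken after 20 jiffies: +20 ticks */
	printf("%ld\n", recharge(0, DEF_PRIORITY, 20));
	/* very long sleep: clamped to 2 * priority = 40 ticks */
	printf("%ld\n", recharge(0, DEF_PRIORITY, 100000));
	return 0;
}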

There is also a built-in provision for niced tasks that sleep
(block on disk) all the time. Because the recharge is scaled
by p->priority, their counter isn't recharged as fast, which
will hopefully keep niced tasks from grabbing as much of the
I/O resources as they do under the old scheme.
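
As a worked example (using the recharge() sketch above, with
the same assumed DEF_PRIORITY of 20): a default-priority task
that sleeps 20 jiffies wakes up with 20 extra ticks, while a
task niced down to priority 5 earns only (20 * 5) / 20 = 5
ticks from the same sleep, and its counter is also clamped at
the lower value 2 * 5 = 10:

	/* appended to main() in the sketch above */
	printf("%ld\n", recharge(0, 5, 20));    /* 5 ticks */
	printf("%ld\n", recharge(0, 5, 1000));  /* clamped to 10 */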

The patch runs stably (I've been running it for 15 minutes
now), even when heavily abused (current load > 12). With 10
CPU loops running, the system remains responsive, even under
X :)

Since this patch offers quite a performance improvement and
is trivial, we might even consider including it before 2.2,
but that's something to decide on later.

Rik.
+-------------------------------------------------------------------+
| Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+

--- linux-2.1.122/kernel/sched.c	Fri Sep 25 16:43:12 1998
+++ linux-local/kernel/sched.c	Sun Sep 27 15:31:56 1998
@@ -157,6 +157,11 @@
 {
 	struct task_struct *next = init_task.next_run;
 
+	p->counter += ((jiffies - p->sleep_time) * p->priority) / DEF_PRIORITY;
+	/* if we sleep 'too long' or if jiffies overflows, the value
+	   will be normalized here (p->counter is signed) */
+	if (p->counter > 2 * p->priority || p->counter < 0)
+		p->counter = 2 * p->priority;
 	p->prev_run = &init_task;
 	init_task.next_run = p;
 	p->next_run = next;
 	next->prev_run = p;
@@ -173,6 +178,7 @@
 	prev->next_run = next;
 	p->next_run = NULL;
 	p->prev_run = NULL;
+	p->sleep_time = jiffies;
 }
 
 static inline void move_last_runqueue(struct task_struct * p)
@@ -537,8 +543,12 @@
 	if (!c) {
 		struct task_struct *p;
 		read_lock(&tasklist_lock);
-		for_each_task(p)
+		p = init_task.next_run;
+		while (p != &init_task) {
+			/* maybe someone added a task, keep the p->counter */
 			p->counter = (p->counter >> 1) + p->priority;
+			p = p->next_run;
+		}
 		read_unlock(&tasklist_lock);
 	}
 }
--- linux-2.1.122/include/linux/sched.h	Fri Sep 25 16:44:06 1998
+++ linux-local/include/linux/sched.h	Sun Sep 27 15:17:49 1998
@@ -259,6 +259,7 @@
 
 	struct wait_queue *wait_chldexit;	/* for wait4() */
 	unsigned long timeout, policy, rt_priority;
+	unsigned long sleep_time;	/* when did we go to sleep? */
 	unsigned long it_real_value, it_prof_value, it_virt_value;
 	unsigned long it_real_incr, it_prof_incr, it_virt_incr;
 	struct timer_list real_timer;
@@ -270,7 +271,6 @@
 	int swappable:1;
 	unsigned long swap_address;
 	unsigned long old_maj_flt;	/* old value of maj_flt */
-	unsigned long dec_flt;		/* page fault count of the last time */
 	unsigned long swap_cnt;		/* number of pages to swap on next pass */
 /* process credentials */
 	uid_t uid,euid,suid,fsuid;
@@ -348,12 +348,12 @@
 /* pidhash */	NULL, NULL, \
 /* tarray */	&task[0], \
 /* chld wait */	NULL, \
-/* timeout */	0,SCHED_OTHER,0,0,0,0,0,0,0, \
+/* timeout */	0,SCHED_OTHER,0,0,0,0,0,0,0,0, \
 /* timer */	{ NULL, NULL, 0, 0, it_real_fn }, \
 /* utime */	{0,0,0,0},0, \
 /* per CPU times */ {0, }, {0, }, \
 /* flt */	0,0,0,0,0,0, \
-/* swp */	0,0,0,0,0, \
+/* swp */	0,0,0,0, \
 /* process credentials */ \
 /* uid etc */	0,0,0,0,0,0,0,0, \
 /* suppl grps*/	0, {0,}, \
