[PATCH] scheduler cleanup

Artur Skawina (skawina@geocities.com)
Wed, 13 Oct 1999 02:17:37 +0100



Due to popular demand, I've extracted just the scheduler fixes. ;)

Highlights (in +/- random order):

o SCHED_RR threads work. (they used to be moved to the end of the
runqueue and had their counter reset, but were still preferred for
selection at the end of their timeslice. IOW, unless they blocked
they were never preempted, not unlike SCHED_FIFO)

o schedule_others() introduced and all sched_yield()-type
manipulations outside the scheduler killed.

o code that used to do "policy|=SCHED_YIELD;schedule()"
and expected lower priority threads to be scheduled now works,
even when used by an RT thread (SCHED_YIELD didn't yield in that
case)

o sched_yield() works for RT threads, ie if there is another RT
process with equal priority, the other one is selected.

o sched_yield() works better for SCHED_OTHER threads, ie a process
calling sched_yield() won't continue to run if there are other
"normal" processes waiting for a CPU

o sched_yield() does not cause rescheduling, when appropriate.
eg if you have only two processes running on a dual SMP box and
one of them calls sched_yield() the call is ignored. Ditto for
a UP machine when there is no other process ready to run.
There's no point in rescheduling if everything that wants to
run is already running...

o FIFO thread wake-up order (used to be LIFO).
Keeping the runnable-but-not-yet-running state as short as
possible minimizes the chances of RT threads missing a deadline;
being fairer, it also seems to make sense for "normal" threads.
[note: this is the only significant change wrt behaviour.
Reverting to LIFO would only be a matter of changing the
"add_to_runqueue_last()" back to "add_to_runqueue()" in
wake_up_process()]

o sched_setscheduler() and sched_setparam() calls move the thread
to be last in its priority class (they used to make the thread
the first one in its class)

o sched_rr_get_interval() returns the real value of a thread's
RR time quantum (it used to return a constant, bogus value)
[the quantum can be altered with nice(2)/setpriority(2) on a
per-thread basis (~1ms .. ~400ms range)]. See the userspace
example after this list.

o nanosleep() yields the CPU to higher-priority RT threads even
when busy-waiting. Noticed by Ingo Molnar (different fix)

o Several obsolete and misleading comments gone. One would wish
people would update the comments when changing the code. Or at
least delete them. Bogus comments are far worse than none at all.

o drivers/ap1000/ddv.c quick fix (I guess it used to work by
accident; this just prevents it from breaking further. It should
be fixed properly, though.)

patch vs stock 2.3.21:


diff -urNp /img/linux-2.3.21/drivers/ap1000/ddv.c linux-2.3.21s1/drivers/ap1000/ddv.c
--- /img/linux-2.3.21/drivers/ap1000/ddv.c Thu May 20 01:59:05 1999
+++ linux-2.3.21s1/drivers/ap1000/ddv.c Wed Oct 13 00:51:49 1999
@@ -371,7 +371,7 @@ static int ddv_daemon(void *unused)

/* Give it a realtime priority. */
current->policy = SCHED_FIFO;
- current->priority = 32; /* Fixme --- we need to standardise our
+ current->rt_priority = 1000+32; /* Fixme --- we need to standardise our
namings for POSIX.4 realtime scheduling
priorities. */

diff -urNp /img/linux-2.3.21/fs/buffer.c linux-2.3.21s1/fs/buffer.c
--- /img/linux-2.3.21/fs/buffer.c Tue Oct 12 01:47:23 1999
+++ linux-2.3.21s1/fs/buffer.c Wed Oct 13 00:51:49 1999
@@ -667,8 +667,7 @@ static void refill_freelist(int size)
{
if (!grow_buffers(size)) {
wakeup_bdflush(1);
- current->policy |= SCHED_YIELD;
- schedule();
+ schedule_others();
}
}

diff -urNp /img/linux-2.3.21/fs/ext2/truncate.c linux-2.3.21s1/fs/ext2/truncate.c
--- /img/linux-2.3.21/fs/ext2/truncate.c Thu Jul 1 17:49:56 1999
+++ linux-2.3.21s1/fs/ext2/truncate.c Wed Oct 13 00:51:49 1999
@@ -380,8 +380,7 @@ void ext2_truncate (struct inode * inode
if (IS_SYNC(inode) && (inode->i_state & I_DIRTY))
ext2_sync_inode (inode);
run_task_queue(&tq_disk);
- current->policy |= SCHED_YIELD;
- schedule();
+ schedule_others();
}
inode->i_mtime = inode->i_ctime = CURRENT_TIME;
mark_inode_dirty(inode);
diff -urNp /img/linux-2.3.21/fs/locks.c linux-2.3.21s1/fs/locks.c
--- /img/linux-2.3.21/fs/locks.c Fri Aug 27 13:13:14 1999
+++ linux-2.3.21s1/fs/locks.c Wed Oct 13 00:51:49 1999
@@ -275,8 +275,7 @@ static void locks_wake_up_blocks(struct
/* Let the blocked process remove waiter from the
* block list when it gets scheduled.
*/
- current->policy |= SCHED_YIELD;
- schedule();
+ schedule_others();
} else {
/* Remove waiter from the block list, because by the
* time it wakes up blocker won't exist any more.
diff -urNp /img/linux-2.3.21/fs/minix/truncate.c linux-2.3.21s1/fs/minix/truncate.c
--- /img/linux-2.3.21/fs/minix/truncate.c Fri Aug 27 13:13:03 1999
+++ linux-2.3.21s1/fs/minix/truncate.c Wed Oct 13 00:51:49 1999
@@ -189,8 +189,7 @@ static void V1_minix_truncate(struct ino
retry |= V1_trunc_dindirect(inode, 7+512, inode->u.minix_i.u.i1_data + 8);
if (!retry)
break;
- current->counter = 0;
- schedule();
+ schedule_others();
}
inode->i_mtime = inode->i_ctime = CURRENT_TIME;
mark_inode_dirty(inode);
@@ -401,8 +400,7 @@ static void V2_minix_truncate(struct ino
if (!retry)
break;
run_task_queue(&tq_disk);
- current->policy |= SCHED_YIELD;
- schedule();
+ schedule_others();
}
inode->i_mtime = inode->i_ctime = CURRENT_TIME;
mark_inode_dirty(inode);
diff -urNp /img/linux-2.3.21/fs/sysv/truncate.c linux-2.3.21s1/fs/sysv/truncate.c
--- /img/linux-2.3.21/fs/sysv/truncate.c Thu Jul 1 17:50:00 1999
+++ linux-2.3.21s1/fs/sysv/truncate.c Wed Oct 13 00:51:49 1999
@@ -284,10 +284,8 @@ void sysv_truncate(struct inode * inode)
printk("sysv_truncate: truncating symbolic link\n");
else if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)))
return;
- while (trunc_all(inode)) {
- current->counter = 0;
- schedule();
- }
+ while (trunc_all(inode))
+ schedule_others();
inode->i_mtime = inode->i_ctime = CURRENT_TIME;
mark_inode_dirty(inode);
}
diff -urNp /img/linux-2.3.21/fs/ufs/truncate.c linux-2.3.21s1/fs/ufs/truncate.c
--- /img/linux-2.3.21/fs/ufs/truncate.c Fri Aug 27 13:13:03 1999
+++ linux-2.3.21s1/fs/ufs/truncate.c Wed Oct 13 00:51:49 1999
@@ -456,10 +456,7 @@ void ufs_truncate (struct inode * inode)
if (IS_SYNC(inode) && (inode->i_state & I_DIRTY))
ufs_sync_inode (inode);
run_task_queue(&tq_disk);
- current->policy |= SCHED_YIELD;
- schedule ();
-
-
+ schedule_others();
}
offset = inode->i_size & uspi->s_fshift;
if (offset) {
diff -urNp /img/linux-2.3.21/include/linux/sched.h linux-2.3.21s1/include/linux/sched.h
--- /img/linux-2.3.21/include/linux/sched.h Tue Oct 12 01:47:23 1999
+++ linux-2.3.21s1/include/linux/sched.h Wed Oct 13 00:51:49 1999
@@ -84,23 +84,23 @@ extern int last_pid;
#define TASK_EXCLUSIVE 32

#define __set_task_state(tsk, state_value) \
- do { tsk->state = state_value; } while (0)
+ do { (tsk)->state = (state_value); } while (0)
#ifdef __SMP__
#define set_task_state(tsk, state_value) \
- set_mb(tsk->state, state_value)
+ set_mb((tsk)->state, (state_value))
#else
#define set_task_state(tsk, state_value) \
- __set_task_state(tsk, state_value)
+ __set_task_state((tsk), (state_value))
#endif

#define __set_current_state(state_value) \
- do { current->state = state_value; } while (0)
+ do { current->state = (state_value); } while (0)
#ifdef __SMP__
#define set_current_state(state_value) \
- set_mb(current->state, state_value)
+ set_mb(current->state, (state_value))
#else
#define set_current_state(state_value) \
- __set_current_state(state_value)
+ __set_current_state((state_value))
#endif

/*
@@ -129,10 +129,7 @@ struct sched_param {
#include <linux/spinlock.h>

/*
- * This serializes "schedule()" and also protects
- * the run-queue from deletions/modifications (but
- * _adding_ to the beginning of the run-queue has
- * a separate lock).
+ * See kernel/sched.c for info on these locks.
*/
extern rwlock_t tasklist_lock;
extern spinlock_t runqueue_lock;
@@ -147,6 +144,7 @@ extern void update_one_process( struct t

#define MAX_SCHEDULE_TIMEOUT LONG_MAX
extern signed long FASTCALL(schedule_timeout(signed long timeout));
+extern void schedule_others(void);
asmlinkage void schedule(void);

/*
@@ -313,7 +311,8 @@ struct task_struct {

wait_queue_head_t wait_chldexit; /* for wait4() */
struct semaphore *vfork_sem; /* for vfork() */
- unsigned long policy, rt_priority;
+ unsigned long policy;
+ long rt_priority;
unsigned long it_real_value, it_prof_value, it_virt_value;
unsigned long it_real_incr, it_prof_incr, it_virt_incr;
struct timer_list real_timer;
diff -urNp /img/linux-2.3.21/ipc/msg.c linux-2.3.21s1/ipc/msg.c
--- /img/linux-2.3.21/ipc/msg.c Sun Oct 10 23:01:07 1999
+++ linux-2.3.21s1/ipc/msg.c Wed Oct 13 00:51:49 1999
@@ -208,8 +208,7 @@ static void freeque (int id)
while(waitqueue_active(&msq->q_rwait)) {
wake_up(&msq->q_rwait);
spin_unlock(&msg_que[id].lock);
- current->policy |= SCHED_YIELD;
- schedule();
+ schedule_others();
spin_lock(&msg_que[id].lock);
}
spin_unlock(&msg_que[id].lock);
diff -urNp /img/linux-2.3.21/kernel/sched.c linux-2.3.21s1/kernel/sched.c
--- /img/linux-2.3.21/kernel/sched.c Fri Aug 27 13:13:14 1999
+++ linux-2.3.21s1/kernel/sched.c Wed Oct 13 00:51:49 1999
@@ -15,6 +15,7 @@
* Copyright (C) 1998 Andrea Arcangeli
* 1998-12-28 Implemented better SMP scheduling by Ingo Molnar
* 1999-03-10 Improved NTP compatibility by Ulrich Windl
+ * 1999-10-13 Various scheduler cleanups by Artur Skawina
*/

/*
@@ -84,8 +85,6 @@ unsigned int * prof_buffer = NULL;
unsigned long prof_len = 0;
unsigned long prof_shift = 0;

-extern void mem_use(void);
-
unsigned long volatile jiffies=0;

/*
@@ -98,16 +97,11 @@ struct task_struct * init_tasks[NR_CPUS]
/*
* The tasklist_lock protects the linked list of processes.
*
- * The scheduler lock is protecting against multiple entry
- * into the scheduling code, and doesn't need to worry
- * about interrupts (because interrupts cannot call the
- * scheduler).
- *
* The run-queue lock locks the parts that actually access
* and change the run-queues, and have to be interrupt-safe.
*/
-spinlock_t runqueue_lock = SPIN_LOCK_UNLOCKED; /* second */
-rwlock_t tasklist_lock = RW_LOCK_UNLOCKED; /* third */
+spinlock_t runqueue_lock = SPIN_LOCK_UNLOCKED; /* acquired first */
+rwlock_t tasklist_lock = RW_LOCK_UNLOCKED; /* acquired second */

static LIST_HEAD(runqueue_head);

@@ -155,17 +149,27 @@ void scheduling_functions_start_here(voi
* +1000: realtime process, select this.
*/

+#define GOODNESS_MIN (-1000) /* goodness value of the real idle task(s) */
+#define GOODNESS_MAX (1000) /* max 'normal' goodness; RT processes have more */
+
static inline int goodness(struct task_struct * p, int this_cpu, struct mm_struct *this_mm)
{
int weight;

- /*
- * Realtime process, select the first one on the
- * runqueue (taking priorities within processes
- * into account).
- */
if (p->policy != SCHED_OTHER) {
- weight = 1000 + p->rt_priority;
+ /* We get here if:
+ * o This is a FIFO or RR process.
+ * Return the static priority (set by user).
+ * For realtime this is 1001...1100.
+ * o This process came from sched_yield() [has SCHED_YIELD flag].
+ * This process wants to give others a chance to run.
+ * It is still runnable, has already been moved to the end
+ * of the runqueue, and won't get any preferential treatment.
+ * For SCHED_OTHER processes return 0 (note this also
+ * causes counter recalculation if this process wins).
+ * For FIFO and RR processes return static priority.
+ */
+ weight = p->rt_priority;
goto out;
}

@@ -197,24 +201,6 @@ out:
}

/*
- * subtle. We want to discard a yielded process only if it's being
- * considered for a reschedule. Wakeup-time 'queries' of the scheduling
- * state do not count. Another optimization we do: sched_yield()-ed
- * processes are runnable (and thus will be considered for scheduling)
- * right when they are calling schedule(). So the only place we need
- * to care about SCHED_YIELD is when we calculate the previous process'
- * goodness ...
- */
-static inline int prev_goodness(struct task_struct * p, int this_cpu, struct mm_struct *this_mm)
-{
- if (p->policy & SCHED_YIELD) {
- p->policy &= ~SCHED_YIELD;
- return 0;
- }
- return goodness(p, this_cpu, this_mm);
-}
-
-/*
* the 'goodness value' of replacing a process on a given CPU.
* positive value means 'replace', zero or negative means 'dont'.
*/
@@ -287,19 +273,19 @@ out_no_target:
#endif
}

-/*
- * Careful!
- *
- * This has to add the process to the _beginning_ of the
- * run-queue, not the end. See the comment about "This is
- * subtle" in the scheduler proper..
- */
+
static inline void add_to_runqueue(struct task_struct * p)
{
list_add(&p->run_list, &runqueue_head);
nr_running++;
}

+static inline void add_to_runqueue_last(struct task_struct * p)
+{
+ list_add_tail(&p->run_list, &runqueue_head);
+ nr_running++;
+}
+
static inline void move_last_runqueue(struct task_struct * p)
{
list_del(&p->run_list);
@@ -331,7 +317,7 @@ void wake_up_process(struct task_struct
p->state = TASK_RUNNING;
if (task_on_runqueue(p))
goto out;
- add_to_runqueue(p);
+ add_to_runqueue_last(p);
spin_unlock_irqrestore(&runqueue_lock, flags);

reschedule_idle(p);
@@ -532,6 +518,27 @@ signed long schedule_timeout(signed long
}

/*
+ * schedule_others() gives other threads that are waiting
+ * for a CPU a chance to run.
+ * Note this differs from sched_yield() -- this doesn't
+ * exclude "normal" threads, even when called by an RT task.
+ */
+
+void schedule_others(void)
+{
+ if (current->policy & (SCHED_FIFO|SCHED_RR))
+ { /* if we are a RT thread */
+ current->state = TASK_INTERRUPTIBLE; /* we have to try harder */
+ schedule_timeout(1);
+ }
+ else
+ {
+ current->policy |= SCHED_YIELD;
+ schedule();
+ }
+}
+
+/*
* schedule_tail() is getting called from the fork return path. This
* cleans up all remaining scheduler things, without impacting the
* common case.
@@ -575,8 +582,11 @@ asmlinkage void schedule(void)
tq_scheduler_back:

prev = current;
+#ifdef __SMP__
this_cpu = prev->processor;
-
+#else
+ this_cpu = smp_processor_id(); /* it's not like we have more than one */
+#endif
if (in_interrupt())
goto scheduling_in_interrupt;

@@ -596,7 +606,7 @@ handle_bh_back:
spin_lock_irq(&runqueue_lock);

/* move an exhausted RR process to be last.. */
- if (prev->policy == SCHED_RR)
+ if (prev->policy&SCHED_RR)
goto move_rr_last;
move_rr_back:

@@ -621,24 +631,25 @@ repeat_schedule:
* Default process to select..
*/
next = idle_task(this_cpu);
- c = -1000;
+ c = GOODNESS_MIN;
+
if (prev->state == TASK_RUNNING)
goto still_running;
still_running_back:

- tmp = runqueue_head.next;
- while (tmp != &runqueue_head) {
+ for (tmp = runqueue_head.next; tmp != &runqueue_head; tmp = tmp->next) {
p = list_entry(tmp, struct task_struct, run_list);
if (can_schedule(p)) {
int weight = goodness(p, this_cpu, prev->active_mm);
if (weight > c)
c = weight, next = p;
}
- tmp = tmp->next;
}

+ prev->policy &= ~SCHED_YIELD;
+
/* Do we need to re-calculate counters? */
- if (!c)
+ if (c==0)
goto recalculate;
/*
* from this point on nothing can prevent us from
@@ -677,13 +688,6 @@ still_running_back:
*/
prev->avg_slice = (this_slice*1 + prev->avg_slice*1)/2;
}
-
- /*
- * We drop the scheduler lock early (it's a global spinlock),
- * thus we have to lock the previous process from getting
- * rescheduled during switch_to().
- */
-
#endif /* __SMP__ */

kstat.context_swtch++;
@@ -739,8 +743,15 @@ recalculate:
goto repeat_schedule;

still_running:
- c = prev_goodness(prev, this_cpu, prev->active_mm);
- next = prev;
+ /* If last thread is still running prefer it over others
+ * with equal goodness. Except if it has just called
+ * sched_yield(), then treat it like any other thread.
+ */
+ if (!(prev->policy&SCHED_YIELD))
+ {
+ c = goodness(prev, this_cpu, prev->active_mm);
+ next = prev;
+ }
goto still_running_back;

handle_bh:
@@ -752,8 +763,9 @@ handle_tq_scheduler:
goto tq_scheduler_back;

move_rr_last:
- if (!prev->counter) {
- prev->counter = prev->priority;
+ if (prev->counter==0) { /* current thread used up its quantum? */
+ prev->counter = prev->priority; /* then reset it */
+ prev->policy |= SCHED_YIELD; /* and simulate a sched_yield() */
move_last_runqueue(prev);
}
goto move_rr_back;
@@ -1540,8 +1552,16 @@ static int setscheduler(pid_t pid, int p
retval = 0;
p->policy = policy;
p->rt_priority = lp.sched_priority;
+ switch ( policy ) /* move rt_priority into proper range, update flags */
+ {
+ case SCHED_RR:
+ case SCHED_FIFO: p->rt_priority += GOODNESS_MAX;
+ default:
+ /*case SCHED_OTHER:*/
+ }
+
if (task_on_runqueue(p))
- move_first_runqueue(p);
+ move_last_runqueue(p);

current->need_resched = 1;

@@ -1605,6 +1625,11 @@ asmlinkage long sys_sched_getparam(pid_t
if (!p)
goto out_unlock;
lp.sched_priority = p->rt_priority;
+ switch ( p->policy )
+ {
+ default: /* case SCHED_OTHER: */ break;
+ case SCHED_RR: case SCHED_FIFO: lp.sched_priority -= GOODNESS_MAX; break;
+ }
read_unlock(&tasklist_lock);

/*
@@ -1622,12 +1647,25 @@ out_unlock:

asmlinkage long sys_sched_yield(void)
{
- spin_lock_irq(&runqueue_lock);
- if (current->policy == SCHED_OTHER)
+/*
+ * A thread calling sched_yield() wants to give up its timeslice and let
+ * other equal priority threads run.
+ * We optimize by ignoring the request completely when the number of
+ * currently runnable processes is <= the number of available CPUs.
+ * IOW on UP sched_yield() does nothing if this is the only process; on
+ * MP we don't reschedule if there are no processes waiting for a CPU.
+ * [note 'nr_running' is accessed w/o proper locking, but that's ok
+ * -- an application can not use yield() as a synchronization mechanism
+ * anyway -- if we sometimes miss or do an extra reschedule that's fine]
+ */
+ if ( nr_running>smp_num_cpus )
+ {
+ spin_lock_irq(&runqueue_lock);
current->policy |= SCHED_YIELD;
- current->need_resched = 1;
- move_last_runqueue(current);
- spin_unlock_irq(&runqueue_lock);
+ current->need_resched = 1;
+ move_last_runqueue(current);
+ spin_unlock_irq(&runqueue_lock);
+ }
return 0;
}

@@ -1664,13 +1702,25 @@ asmlinkage long sys_sched_get_priority_m

asmlinkage long sys_sched_rr_get_interval(pid_t pid, struct timespec *interval)
{
- struct timespec t;
+ struct timespec t;
+ struct task_struct *p;
+ int retval;

- t.tv_sec = 0;
- t.tv_nsec = 150000;
- if (copy_to_user(interval, &t, sizeof(struct timespec)))
- return -EFAULT;
- return 0;
+ if (pid < 0)
+ return -EINVAL;
+
+ retval = -ESRCH;
+
+ read_lock(&tasklist_lock);
+ p = find_process_by_pid(pid);
+ if (p)
+ jiffies_to_timespec(p->priority, &t);
+ read_unlock(&tasklist_lock);
+
+ if (p)
+ retval = copy_to_user(interval, &t, sizeof(struct timespec)) ? -EFAULT : 0;
+
+ return retval;
}

asmlinkage long sys_nanosleep(struct timespec *rqtp, struct timespec *rmtp)
@@ -1686,7 +1736,7 @@ asmlinkage long sys_nanosleep(struct tim


if (t.tv_sec == 0 && t.tv_nsec <= 2000000L &&
- current->policy != SCHED_OTHER)
+ (current->policy & (SCHED_FIFO|SCHED_RR)))
{
/*
* Short delay requests up to 2 ms will be handled with
@@ -1694,10 +1744,18 @@ asmlinkage long sys_nanosleep(struct tim
*
* Its important on SMP not to do this holding locks.
*/
- udelay((t.tv_nsec + 999) / 1000);
+ for ( expire=t.tv_nsec; ((long)expire)>0; expire-=10000 )
+ {
+ if ( current->need_resched ) /* there may be a higher priority */
+ { /* RT thread waiting... */
+ t.tv_nsec = expire; /* subtract the time we've slept */
+ goto schedule; /* let the scheduler do its job */
+ }
+ udelay(10); /* busy loop for ~10us (10000ns) */
+ }
return 0;
}
-
+schedule:
expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);

current->state = TASK_INTERRUPTIBLE;
@@ -1830,13 +1888,9 @@ void __init sched_init(void)
* We have to do a little magic to get the first
* process right in SMP mode.
*/
- int cpu=hard_smp_processor_id();
- int nr;
-
- init_task.processor=cpu;
+ init_task.processor = hard_smp_processor_id();

- for(nr = 0; nr < PIDHASH_SZ; nr++)
- pidhash[nr] = NULL;
+ memset( pidhash, 0, PIDHASH_SZ*sizeof(pidhash[0])); /* clear pidhash[] */

init_bh(TIMER_BH, timer_bh);
init_bh(TQUEUE_BH, tqueue_bh);
diff -urNp /img/linux-2.3.21/mm/page_alloc.c linux-2.3.21s1/mm/page_alloc.c
--- /img/linux-2.3.21/mm/page_alloc.c Wed Sep 1 20:34:29 1999
+++ linux-2.3.21s1/mm/page_alloc.c Wed Oct 13 00:51:49 1999
@@ -314,10 +314,8 @@ ok_to_allocate:
* We may be a real-time process, and if kswapd is
* waiting for us we need to allow it to run a bit.
*/
- if (gfp_mask & __GFP_WAIT) {
- current->policy |= SCHED_YIELD;
- schedule();
- }
+ if (gfp_mask & __GFP_WAIT)
+ schedule_others();

nopage:
return 0;
diff -urNp /img/linux-2.3.21/net/socket.c linux-2.3.21s1/net/socket.c
--- /img/linux-2.3.21/net/socket.c Tue Oct 12 01:47:23 1999
+++ linux-2.3.21s1/net/socket.c Wed Oct 13 00:51:49 1999
@@ -141,8 +141,7 @@ static void net_family_write_lock(void)
while (atomic_read(&net_family_lockct) != 0) {
spin_unlock(&net_family_lock);

- current->policy |= SCHED_YIELD;
- schedule();
+ schedule_others();

spin_lock(&net_family_lock);
}
diff -urNp /img/linux-2.3.21/net/unix/af_unix.c linux-2.3.21s1/net/unix/af_unix.c
--- /img/linux-2.3.21/net/unix/af_unix.c Tue Oct 12 01:47:23 1999
+++ linux-2.3.21s1/net/unix/af_unix.c Wed Oct 13 00:51:07 1999
@@ -542,10 +542,8 @@ retry:
addr->hash)) {
write_unlock(&unix_table_lock);
/* Sanity yield. It is unusual case, but yet... */
- if (!(ordernum&0xFF)) {
- current->policy |= SCHED_YIELD;
- schedule();
- }
+ if (!(ordernum&0xFF))
+ schedule_others();
goto retry;
}
addr->hash ^= sk->type;
