typecast bug in sched.c bites reschedule_idle on alpha

From: Andy Isaacson (adi@hexapodia.org)
Date: Thu Jul 17 2003 - 16:51:39 EST


On SMP Alpha, a saturated system (with one busy userland process per
CPU) becomes almost completely unusable. The busy loops don't have to
be doing anything more complicated than "while(1) ;". The symptom is
that even logging into the system (with ssh) takes on the order of 30
seconds. (Tested on an ES40 with 4 processors, 2.4.18.)

It turns out that the problem is in kernel/sched.c:reschedule_idle.

        cycles_t oldest_idle;
...
        oldest_idle = (cycles_t) -1;
...
                        if (oldest_idle == -1ULL) {

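Since asm-alpha/timex.h defines cycles_t as unsigned int, oldest_idle
is zero-extended to 64 bits before the comparison, so the sentinel
0xffffffff never equals -1ULL (0xffffffffffffffff) and the comparison
is always false. Changing it to (cycles_t)-1 fixes the problem.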
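
For anyone who wants to see the promotion in isolation, here is a
minimal userspace sketch. It is not kernel code: the typedef just
mirrors the alpha definition mentioned above, and the helper names
(is_unset_buggy, is_unset_fixed, sentinel_selfcheck) are made up for
illustration. It assumes a 32-bit unsigned int and a 64-bit unsigned
long long, as on alpha.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: same width as cycles_t in asm-alpha/timex.h. */
typedef unsigned int cycles_t;

/* Original test: oldest_idle is converted to unsigned long long
 * (zero-extended) before being compared with 0xffffffffffffffff. */
static bool is_unset_buggy(cycles_t oldest_idle)
{
        return oldest_idle == -1ULL;
}

/* Patched test: both operands are cycles_t, so the sentinel matches. */
static bool is_unset_fixed(cycles_t oldest_idle)
{
        return oldest_idle == (cycles_t)-1;
}

/* Hypothetical self-check exercising both forms with the sentinel. */
static void sentinel_selfcheck(void)
{
        cycles_t oldest_idle = (cycles_t) -1;

        assert(!is_unset_buggy(oldest_idle)); /* never matches */
        assert(is_unset_fixed(oldest_idle));  /* matches as intended */
}
```

On an architecture where cycles_t is already 64 bits wide the two
forms should agree, which is why the bug only shows up where cycles_t
is narrower than unsigned long long.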

Patch below. I'd like this to go into 2.4.22, if nobody has a problem
with that. I would like confirmation from other eyes -- this patch
doesn't break semantics on any architecture, does it?

-andy

--- linux-2.4.21/kernel/sched.c Thu Jul 17 16:43:37 2003
+++ linux-2.4.21-idle-fix/kernel/sched.c Thu Jul 17 16:23:25 2003
@@ -282,7 +282,7 @@
                                 target_tsk = tsk;
                         }
                 } else {
-                        if (oldest_idle == -1ULL) {
+                        if (oldest_idle == (cycles_t)-1) {
                                 int prio = preemption_goodness(tsk, p, cpu);
 
                                 if (prio > max_prio) {
@@ -294,7 +294,7 @@
         }
         tsk = target_tsk;
         if (tsk) {
-                if (oldest_idle != -1ULL) {
+                if (oldest_idle != (cycles_t)-1) {
                         best_cpu = tsk->processor;
                         goto send_now_idle;
                 }


