Re: Potential 2.2.8 scheduler bugs

Andrea Arcangeli (andrea@suse.de)
Thu, 13 May 1999 23:40:59 +0200 (CEST)


On Thu, 13 May 1999, Ingo Molnar wrote:

>> On another front, release() in exit.c contains the following piece of code:
>>
>> for (;;) {
>> int has_cpu;
>> spin_lock_irq(&runqueue_lock);
>> has_cpu = p->has_cpu;
>> spin_unlock_irq(&runqueue_lock);
>> if (!has_cpu)
>> break;
>> do {
>> barrier();
>> } while (p->has_cpu);
>> }
>
>yes this is historical code, the outer loop is not needed anymore.

I think that this race will never trigger but according to me rmb() may be
still desiderable. We must make sure to read p->has_cpu after p->state.

Maybe our CPU reads p->has_cpu and see 0, before reading p->state and
seeing TASK_ZOMBIE, and between the two reads the other cpu scheduled the
task `p' and the task `p' exits and p->state get set to ZOMBIE but do_exit
has still to complete (has_cpu is 1 but the other CPU think it's 0
because it read has_cpu out of order).

And it's sure not needed in i386 because before calling release() there is
a read_unlock() that will issue a lock on the bus, but I think for other
archs a lock on the bus is not enough to flush the CPU-OOO queue.

The spinlock in the current code would have synchronized the reads as
rmb() can do.

for (;;) {
int has_cpu;
rmb()
has_cpu = p->has_cpu;
if (!has_cpu)
break;
do {
barrier();
} while (p->has_cpu);
}

Andrea Arcangeli

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/