Possible race condition in i386 global_irq_lock handling.

From: TeJun Huh
Date: Thu Aug 21 2003 - 03:48:26 EST


I've been reading i386 interrupt handling code for a couple of days
and encountered something that looks like a race condition. It's
between include/asm-i386/hardirq.h:irq_enter() and
arch/i386/kernel/irq.c:get_irqlock(). They seem to be using lockless
synchronization with local_irq_count of each cpu and global_irq_lock
variable.

A. locking CPU

1. Do test_and_set_bit() on global_irq_lock, if fail, repeat.
2. If all local_irq_count's are zero, we're the winner. Check other
stuff; otherwise, clear global_irq_lock and retry.

B. other CPUs

1. Increment local_irq_count
2. test_bit() on global_irq_lock, if zero, continue handling interrupt;
otherwise, wait till it's cleared.

For this to work, the locking CPU should fetch the value of
local_irq_count after global_irq_lock value becomes visible to other
CPUs, and other CPUs should fetch the value of global_irq_lock after
making the incremented local_irq_count visible to other CPUs.

The locking CPU is OK because test_and_set_bit() forces ordering on
x86, but there should be a mb() betweewn step 1 and 2 for other CPUs
because none of ++ and test_bit is ordering. The B part is irq_enter()
in hardirq.h which looks like the following.

static inline void irq_enter(int cpu, int irq)
{
++local_irq_count(cpu);

while (test_bit(0,&global_irq_lock)) {
cpu_relax();
}
}

Is it a race condition or am I getting it horribly wrong? Thx in
advance.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/