Re: Jiffy based timers/timeouts can expire too soon.

From: George Anzinger
Date: Thu Dec 16 2004 - 15:43:58 EST


Anton Blanchard wrote:


Well, hopefully the lost tick detection code won't over compensate, so
it shouldn't be an issue. However, as Tim Mann pointed out it, due to
interrupt delay and queuing, it is seen on virtualized systems.


We saw this on ppc64 on earlier 2.6 kernels. There were some bugs with
the VM where interrupts would get disabled for a long time (we saw 20+
second periods). A SCSI timeout would occur on another CPU and at that
time irqs would get reenabled and 20 seconds of time would get replayed.

A bunch of timers would go off early and the SCSI adapter would explode.

The problem is that "most" code believes jiffies is right. Under long interrupt off times, it is not. I suspect that most of the early timers came from code that set the timer with the interrupt system off. Some might say they got what they deserved :).

In the HRT patch, we always correct jiffies to the real value (by marking the TSC value at the last jiffie push and using that plus the current TSC to correct). It would be rather easy to provide an interface to get the current real current jiffie, but it is another thing to correct all the code that uses jiffie. Attempts to make jiffie a macro pick up far too many uses of the word in several name spaces to make it a reasonable thing to do.
-
George Anzinger george@xxxxxxxxxx
High-res-timers: http://sourceforge.net/projects/high-res-timers/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/