I've noticed a handful of messages in recent months regarding the problems
with the VIA chipset timer. It would appear that the timer fails every
so often and this causes gettimeofday to start returning weird values.
This has the following symptoms that I've noticed:
- clock often jumps forward 71 minutes, then back
- screensaver kicks on unexpectedly
- video playback programs freeze or start stuttering
- PS/2 mouse flies up to upper right corner under X
... I'm sure there's more; odd timeofday values cause lots of strange
behavior.
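A note on the 71-minute figure, offered as a hint rather than a diagnosis: 2^32 microseconds is about 71.6 minutes, which is the size of jump one would expect if a 32-bit microsecond quantity wrapped or had a small negative value stored in it. The helper names below are made up for illustration and do not come from the kernel.

```c
#include <stdint.h>

/* Hypothetical helper: the number of minutes represented by one full
 * wrap of a 32-bit microsecond counter (2^32 us). */
static double wrap_minutes_u32_usec(void)
{
    const double usec = (double)UINT32_MAX + 1.0; /* 4294967296 us */
    return usec / 1e6 / 60.0;                     /* roughly 71.58 minutes */
}

/* What a small negative offset looks like once it is reinterpreted as
 * an unsigned 32-bit value: -1 us turns into 4294967295 us. */
static uint32_t as_unsigned_usec(int32_t signed_usec)
{
    return (uint32_t)signed_usec;
}
```

If that is what is happening, the "forward 71 minutes, then back" pattern would fit a single bad offset being added to an otherwise correct base time.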
There are a few patches floating around that fix this in some cases, but
not all. I've looked into this further and created a patch that I think
does a much better job, though it may not be perfect yet.
In 2.4.18, whenever the code sees the microsecond offset start to grow too
large, it guesses that there's a timer problem and smacks the timer. This
seems to work, but I think the code is in the wrong place.
The workaround only runs if CONFIG_X86_TSC is not set.
Athlon-optimized kernels seem likely to have CONFIG_X86_TSC set (the
Red Hat Athlon kernel does), so it seems wrong to put the workaround there.
Additionally, there's a while loop in do_gettimeofday() that will loop
millions of times if an unreasonable offset is returned from
do_gettimeoffset(). This can be avoided by doing division instead.
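A minimal sketch of the loop-versus-division point, in plain userspace C. The struct and function names are illustrative and are not the kernel's; only the idea of normalizing an oversized microsecond count is taken from the text above.

```c
#define USEC_PER_SEC_SKETCH 1000000UL

/* Illustrative stand-in for a (seconds, microseconds) pair. */
struct tv_sketch {
    unsigned long sec;
    unsigned long usec;
};

/* Loop form: one iteration per whole second carried in usec, so the
 * cost grows with the size of a bogus offset. */
static struct tv_sketch normalize_loop(unsigned long sec, unsigned long usec)
{
    struct tv_sketch tv;

    while (usec >= USEC_PER_SEC_SKETCH) {
        usec -= USEC_PER_SEC_SKETCH;
        sec++;
    }
    tv.sec = sec;
    tv.usec = usec;
    return tv;
}

/* Division form: same result, constant cost whatever the offset is. */
static struct tv_sketch normalize_div(unsigned long sec, unsigned long usec)
{
    struct tv_sketch tv;

    tv.sec = sec + usec / USEC_PER_SEC_SKETCH;
    tv.usec = usec % USEC_PER_SEC_SKETCH;
    return tv;
}
```

Both functions return the same pair for any input; the division form just does not depend on how far out of range the offset is.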
I've worked over the code a bit, and I have a new patch that moves the
timer-smack into the part of the code that executes whether the TSC is
being used or not. If you don't like the amount of code I've moved
around, fear not: most of the code shuffling is just to make the debugging
printk print the data I want. It should be straightforward to make a
smaller patch that does the same thing.
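The shape of the change can be sketched roughly as follows. This is a userspace illustration, not the patch itself: the threshold, the HZ value, the function-pointer indirection, and every name except smack_timer are assumptions, and the stub only counts calls instead of touching hardware.

```c
/* Assumed values for illustration only. */
#define HZ_SKETCH          100UL
#define USEC_PER_TICK      (1000000UL / HZ_SKETCH)
#define MAX_SANE_OFFSET    (4 * USEC_PER_TICK)  /* hypothetical threshold */

/* Either offset source (TSC-based or PIT-based) fits this signature. */
typedef unsigned long (*offset_fn)(void);

static int smack_count;

/* Stand-in for smack_timer(): the real one resets the timer, this one
 * only records that it was called. */
static void smack_timer_stub(void)
{
    smack_count++;
}

/* Shared path: read the offset from whichever source is configured,
 * then apply the sanity check here so it runs with or without the TSC.
 * The offset is returned unchanged, bogus or not. */
static unsigned long checked_offset(offset_fn read_offset)
{
    unsigned long usec = read_offset();

    if (usec > MAX_SANE_OFFSET)
        smack_timer_stub();
    return usec;
}

/* Fake offset sources used to exercise the check. */
static unsigned long fake_good_offset(void)  { return 1234UL; }
static unsigned long fake_bogus_offset(void) { return 4294000000UL; }
```

Calling checked_offset(fake_good_offset) leaves smack_count alone, while checked_offset(fake_bogus_offset) bumps it once; the point is only that the check sits after the source-specific read, not inside one branch of it.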
In my testing (using CONFIG_X86_TSC) this improves the situation quite a
bit: before, the timer would stay messed up and the machine would act
crazy until the next reboot. Now, there may be a single bad value
returned but the system goes back to normal after that. Maybe not
perfect, but certainly better.
I'd appreciate it if anyone experiencing odd behavior on Via chipsets
could give it a try. The problem usually only occurs under heavy loads; I
have reproduced it often by creating massive images (5000x5000 pixels) in
The Gimp or playing MPEG files while copying huge files around.
The patch works well today, but there are still a few outstanding
questions I have:
1. Why does this (bogus offset) happen? Has the timer died? Is there
another way to prevent this from happening in the first place?
2. Is it possible to resurrect what the correct offset should be at this
point?
3. If not, what's the best value to use as an offset here? I'm still
using the bogus value to calculate the timeofday returned. Is there a
better choice?
4. What does the code (which I've named smack_timer) do? Is it correct or
just lucky? I kept the workaround code that was already in 2.4.18, but I
don't understand what it's doing.
Patch attached applies against 2.4.18 and the Red Hat 7.3 kernels. I'll
keep my latest version here: