Re: Various problems on Axis 700 Lite VIA C7

From: Michael Tokarev
Date: Fri Oct 05 2007 - 16:48:12 EST


Guennadi Liakhovetski wrote:
> Hi
>
> Ok, after a day of biseting, it turns out to be a compiler problem. The
> gcc-3.3.5 produces at least these two problems (Oops on i2c-viapro probe
> and disabled IRQs in USB), whereas 4.1.2 has no problem so far. Up to now
> 3.3.5 had no problem compiling 2.6.20+ kernels here, for example, for P-II
> SMP. Does it at all look realistic that such "random" run-time problems
> are caused by a miscompilation?...

I can't say about compilers (but it looks to me somewhat possible still),
but I can say a bit about the platform/CPU. You can find my thread titled
"VIA C7 anyone" from several months back in archives - that was my first
expirience. Since that, I received several emails from others with similar
problems.

The things is that at least boards I'm using, but I suspect it's CPU not
the board, -- are somewhat... flaky, so to say, and their reliability (or
even ability to work) depends on several factors, starting with production
conditions (environment at a time when it has been produced) and up to
various thermal factors.

It seems there's quite significant percentage of C7-based boards that are
flaky/unreliable, replacing one with another from the same batch usually
fixes the prob.

Next, some boards are VERY sensitive to themperature, and their thermal
sensors are WRONG - it seems - 100% of the time, showing ~20% less
themperature than it really is (say, when the sensor shows 35 degrees
celsius, the themp really is about 45..50 degrees) -- when the themperature
(on SOME samples) grows above 40 degrees, the system becomes unreliable
and may crash randomly here and there.

More, due to geometry of the CPU chip, with very small square area that
touches the headsink and relatively large headsink, it's sometimes enouth
to just touch the headsink so it positions wrongly, with bad thermo-contact
between the CPU and the headsink, resulting in high themperatures and
system instability.

And even more interesting -- it seems that some sequence of instructions
are more frequently misinterpreted (under "abnormal" conditions above)
than other sequences doing the same thing. That is, the same program
compiled with gcc-3.4 may work almost 100% correct while the same thing
compiled with gcc-4.1 may almost alway fail (usually due to segmentation
fault), or exactly the opposite.

So umm.... ;)

I'm running VIA C7 on this machine where I'm typing right now - no single
glitch since the time I replaced the failing one (at a time of mentioned
thread), it's rock-solid. I even removed the fan from the headsink - the
themp grows up to 70..80 degrees celsius (according to lm-sensors, so actual
themperature should be higher) when I run CPU-intensive apps, and nothing
breaks. Previous motherboard which I replaced (the same model, from the
same batch) was breaking randomly, but worked relatively stable when
placed on the street and with a large "external" fan used in rooms for
ventilation... ;) even without any load when onboard sensors showed 35
degrees.

That to say - it may be some miscompilations, but may be some probs with
hardware itself. If you can, try to reproduce the same on another board
(I just tried to boot 2.6.23-rc5 on this machine, compiled for PIII CPU
using gcc-4.1.2 (no other version installed, sorry) - no issues so far).

And no, I'm not trying to say "don't use ViA C7" etc -- I just love this
my box, it's very good, powerful enouth and quiet. Just be prepared for
some... issues, which happens - rarely but still.

/mjt

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/