Re: test10-pre1 problems on 4-way SuperServer8050

From: Tigran Aivazian (tigran@veritas.com)
Date: Wed Oct 11 2000 - 14:44:06 EST


On Wed, 11 Oct 2000, Rik van Riel wrote:

> On Wed, 11 Oct 2000, Tigran Aivazian wrote:
> > On Wed, 11 Oct 2000, Mark Hemment wrote:
> > > On Wed, 11 Oct 2000, Tigran Aivazian wrote:
> > >
> > > > a) one of the eepro100 interfaces (the onboard one on the S2QR6 mb) is
> > > > malfunctioning, interrupts are generated but no traffic gets through (YES,
> > > > I did plug it in correctly, this time, and I repeat 2.2.16 works!)
> > >
> > > I saw this the other week on our two-way Dell under a reasonibly heavy
> > > load - but with 3c59x.c driver, the eepro100s survived!
> > > Either NIC (had two Tornados) could go this away after anything from 1
> > > to 36 hours of load. They would end up running in "poll" mode off the
> > > transmit watchdog timer.
> > > Swapped them for a dual-port eepro100 and no more problems.
> >
> > I disabled eepro100 support completely and the problem is still
> > there. What I also noticed is that with highmem-PAE enabled I
> > get BUG in page_alloc.c at line 221 so it is probably a VM
> > problem recently introduced (hence cc'd Rik).
>
> Can you trigger this bug /without/ PAE ?
>

no, I can't.

> I've been stress-testing my dual-cpu test machines (one with
> 64MB and one with 1GB ram) very very heavily for the last 4
> days and haven't encountered any bug whatsoever ...
>
> Btw, what compiler are you using ?

kgcc on red hat 6.9 which is really egcs-2.91.66

>
> > I will continue to narrow down by removing some things (like
> > mtrr) from the equation. Rik, the problem is that when one
> > enables PAE (or just highmem-4G) support on a 4-way 6G RAM
> > machine becomes 38-40 times slower.
>
> 38-40 times slower in what kind of benchmark ?

compiling the kernel, specifically. But even a simple thing like "time
ps" shows about 0.9 seconds real time when it should show something like
0.021 seconds. Everything becomes unbearably slow, make xconfig takes 4-5
minutes to startup, the shutdown becomes impossible so I do sysrq-B after
a few syncs etc. Also, as I said, one of the eepro100 interfaces becomes
dead. I believe this _is_ the same problem even if it really seems it is
not.

Regards,
Tigran

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Oct 15 2000 - 21:00:19 EST