linux-kernel-digest Thursday, 13 February 1997 Volume 01 : Number 746
In this issue:
Unresolved symbols in /lib/modules/.../*.o?
Behavior under swap catastrophe?
Keyboard hangs when using PS/2 mouse
Re: CONUNDRUM.
Re: 640MB MO patch
Re: ramdisk problem 2.1.2[56]
Re: IDE Disk Problems
Re: Sparc module char-major-14?
Re: IDE Disk Problems
Re: CONUNDRUM.
Re: IDE Disk Problems
Re: Unresolved symbols in /lib/modules/.../*.o?
PATCH for 2.1.26: newly forked processes killed by "handled signals"
Re: Big malloc's.
Re: 2.0.27 major problems #1 -- 3c59x driver.
hard disk drive status
Re: IDE Disk Problems
Re: ramdisk problem 2.1.2[56]
Re: CONUNDRUM.
RE: B*gg*r mallocs, mmap JOY!
Re: IDE Disk Problems
VFS/Posix question
Re: CONUNDRUM.
NFS problem with 2.0.25
MENUCONFIG errors
Re: IDE Disk Problems
Re: Performance patch for NE Ethernet
Re: MENUCONFIG errors
Re: 2.0.27 major problems #1 -- 3c59x driver.
[none]
Re: 640MB MO patch
Re: Behavior under swap catastrophe?
Re: IDE Disk Problems
Re: CONUNDRUM.
Linux & EISA bus ?
Re: Linux VM subsystem (Was: Big mallocs, mmap sorrows and double buffering.)
"Modules Oops" workaround for 2.1.26
Re: Version bug in 2.0.29?
Thanks, and another Question
Re: CONUNDRUM.
Re: CONUNDRUM.
Resyncs
Sony CDU33a+kernel>2.1.22=system hangs totally
insmod
Re: CONUNDRUM.
Re: Performance patch for NE Ethernet
Report on compiling 2.1.26
See the end of the digest for information on subscribing to the linux-kernel
or linux-kernel-digest mailing lists.
----------------------------------------------------------------------
From: fruviad <fruviad@coil.com>
Date: Wed, 12 Feb 1997 18:56:30 -0500 (EST)
Subject: Unresolved symbols in /lib/modules/.../*.o?
Hi,
Tried sending this to linux-newbie, but I haven't gotten an
answer from there. Thought I'd ask here.
I recompiled my kernel (2.0.18, Redhat) and now get a bunch of
complaints upon boot. Something along the lines of:
Finding module dependencies
Unresolved symbols in /lib/modules/2.0.18/yadda/yadda.o
where "yadda" corresponds to quite a few file/directory names.
I haven't added any patches to the kernel, so the version is the
same. I've tried doing a number of other things (make modules, make
modules_install)...without luck.
Anyone have any ideas what I'm missing?
Thanks...
Peter
ps...it seems that all the linux kernel docs I've found are either
extremely simplified (ie. saying "yes" here does this...), or
kernel-hacking. Anyone know of anything in-between for a non-programmer
who wants to get into programming/kernel stuff?
Thanks again...
- --
/\_/\ |
( o.o ) | fruviad@coil.com (Lizard Ho!!!)
> < |
------------------------------
From: jf@ugcs.caltech.edu (Joe Fouche)
Date: Wed, 12 Feb 1997 15:54:07 -0800
Subject: Behavior under swap catastrophe?
I've noticed lately that the behavior of the kernel when some process goes
berserk and fills up all the swap is a little strange. It seems to start sending
SEGV's to many processes as the large one grows. This wouldn't be so bad, except
that init is often killed. Is a modification to protect the life of init in order?
Or should we just make sure this never happens?
Comments or flames welcome.
- --
_ ____ Joe Fouche (jf@ugcs.caltech.edu)
___| |--- Deranged College Student
------------------------------
From: Benny Amorsen <amorsen@sscnet.com>
Date: 13 Feb 1997 00:26:20 +0000
Subject: Keyboard hangs when using PS/2 mouse
Using kernels 2.0.27 through 2.0.29, I experience keyboard hangs every
time I start gpm. /dev/mouse is a symlink to /dev/psaux, and I have
tried psaux both as a module and compiled into the kernel. In both
cases, the mouse detection does not hang the keyboard. It is only when
gpm starts that the keyboard hangs.
gpm is version 1.10. The mouse in question is a two-button Compaq PS/2
mouse. The motherboard is Asus P55T2P4, and the bios detects the
presence of a mouse. If the mouse is disconnected at boottime, the
bios switches the PS/2 interface off. The kernel detects the interface
fine, too, but as far as I can see the kernel just believes whatever
the BIOS tells it.
The keyboard is a good old no-name AT-keyboard.
Benny
------------------------------
From: Systemkennung Linux <linux@informatik.uni-koblenz.de>
Date: Thu, 13 Feb 1997 01:50:46 +0100 (MET)
Subject: Re: CONUNDRUM.
> > First guess Alexey. Linux crt0.s doesnt align the stack on an eight byte
> > boundary.
>
> Any reason why it shouldn't? Could this simple thing cause a >10%
> performance drop for some applications? If so why wasn't this noticed and
> fixed long ago?
Alan's first guess makes sense in that the Fortran people have been
complaining about exactly that problem for quite some time.
Ralf
------------------------------
From: NIIBE Yutaka <gniibe@mri.co.jp>
Date: Thu, 13 Feb 1997 09:39:26 +0900
Subject: Re: 640MB MO patch
Shigehiro Nomura writes:
> The patch for 640MB MO written by Mr. Nagai has been released a few
> months ago, too :-)
Sigh. Shigehiro, please learn how people cooperate. It seems (at
least to me) that your attitude is not polite enough toward Eric
and the Linux community. If you really read and learn how the SCSI
subsystem, the iso9660 filesystem, or the ELF system work in Linux, you
must know Eric Youngdale, who has contributed a great deal to Linux.
I think that we should learn/check how things are going before
proposing a patch. This makes the development process easy. IMHO, the
development of free software is not a race, but something like a folk
dance.
Well, I never think your effort to add a feature is bad or wrong. Yes,
it is a great thing in itself, really, but the way you did it is
questionable. It is better for our community to cooperate, you know.
Thanks,
- --
NIIBE Yutaka
------------------------------
From: "Andrew E. Mileski" <aem@netcom.ca>
Date: Wed, 12 Feb 1997 20:07:23 -0500 (EST)
Subject: Re: ramdisk problem 2.1.2[56]
> When I attempt to build an ext2 filesystem on a ramdisk, mke2fs complains
> about not being able to erase block 0. I am able to mount the file system
> but any file write operation on the mounted file system hangs. Is this a
> known problem? I am fairly sure that all of my pertinent libs etc. are at
> the correct levels. I see the same problem with 2.1.25 also.
It is a known problem, and the solution is to read the man page :-)
(The option you need to use is '-F')
- --
Andrew E. Mileski mailto:aem@netcom.ca
Linux Plug-and-Play Kernel Project http://www.redhat.com/linux-info/pnp/
XFree86 Matrox Team http://www.bf.rmit.edu.au/~ajv/xf86-matrox.html
------------------------------
From: "Andrew E. Mileski" <aem@netcom.ca>
Date: Wed, 12 Feb 1997 20:11:58 -0500 (EST)
Subject: Re: IDE Disk Problems
> I had the same problem on a 21000 (1Gb). Sent it back to the supplier,
> they replaced the disk's software and did a new low level format. Had no
> trouble since. There have been problems with a few series of WD drives
> that require a replacement of the drive's software. You can download a fix
> from www.wdc.com
I can second this - sent the drive back to WD, and they sent me back
a new drive (WD has a 1 year warranty). I think the Caviar must have had
a bad run or two or three...anyways, works fine now, but they are hugely
more noisy than my Seagate Hawks (I've got a pair of each).
- --
Andrew E. Mileski mailto:aem@netcom.ca
Linux Plug-and-Play Kernel Project http://www.redhat.com/linux-info/pnp/
XFree86 Matrox Team http://www.bf.rmit.edu.au/~ajv/xf86-matrox.html
------------------------------
From: "David S. Miller" <davem@jenolan.rutgers.edu>
Date: Wed, 12 Feb 1997 20:21:30 -0500
Subject: Re: Sparc module char-major-14?
Date: Wed, 12 Feb 1997 14:41:31 -0800 (PST)
From: Trevor Johnson <trevor@blues.jpj.net>
Richard A Sahlender Jr wrote:
> modprobe: Can't locate module char-major-14
>
> Received this on a Sparc 2 yesterday.
I'm not sure these are correct but they work for me (on x86):
They are right, but the issue is that we do not have finished drivers
for the Sparc sound hardware as of yet, and thus the modules won't be
there anyways.
- ---------------------------------------------////
Yow! 11.26 MB/s remote host TCP bandwidth & ////
199 usec remote TCP latency over 100Mb/s ////
ethernet. Beat that! ////
- -----------------------------------------////__________ o
David S. Miller, davem@caip.rutgers.edu /_____________/ / // /_/ ><
------------------------------
From: JHazard <jdhazard@texas.net>
Date: Wed, 12 Feb 1997 19:57:16 -0600
Subject: Re: IDE Disk Problems
My company recently received around 100 HPs with Western Digital 1.6s
in them. It's just a guesstimate, but I believe that around 15-20% of
them have had drive problems within the first two months. These are
mostly running NT; however, the one in my area is running Linux and I
suspect it too is having problems. I've basically lost both of my
bootable Linux partitions once (at different times...) due to drive errors.
Now my WD 1.6 at home has never caused me a problem with any operating
system.
Andrew E. Mileski wrote:
>
> > I had the same problem on a 21000 (1Gb). Sent it back to the supplier,
> > they replaced the disk's software and did a new low level format. Had no
> > trouble since. There have been problems with a few series of WD drives
> > that require a replacement of the drive's software. You can download a fix
> > from www.wdc.com
>
> I can second this - sent the drive back to WD, and they sent me back
> a new drive (WD has a 1 year warranty). I think the Caviar must have had
> a bad run or two or three...anyways, works fine now, but they are hugely
> more noisy than my Seagate Hawks (I've got a pair of each).
>
> --
> Andrew E. Mileski mailto:aem@netcom.ca
> Linux Plug-and-Play Kernel Project http://www.redhat.com/linux-info/pnp/
> XFree86 Matrox Team http://www.bf.rmit.edu.au/~ajv/xf86-matrox.html
------------------------------
From: Clive Messer <clive@epos.demon.co.uk>
Date: Thu, 13 Feb 1997 02:10:42 +0000 (GMT)
Subject: Re: CONUNDRUM.
On Thu, 13 Feb 1997, Systemkennung Linux wrote:
> > > First guess Alexey. Linux crt0.s doesnt align the stack on an eight byte
> > > boundary.
> >
> > Any reason why it shouldn't? Could this simple thing cause a >10%
> > performance drop for some applications? If so why wasn't this noticed and
> > fixed long ago?
>
> Alan's first guess makes sense in that the Fortran people have been
> complaining about exactly that problem for quite some time.
I just noticed in HJ's libc Changelog, 5.4.21-5.4.22 .......
- --------------------------------------------------------------------
Wed Jan 29 20:50:53 1997 H.J. Lu (hjl@gnu.ai.mit.edu)
* sysdeps/linux/i386/crt/crt0.S: align stack to 8 bytes.
- --------------------------------------------------------------------
Clive.
- --
C Messer
<clive@epos.demon.co.uk> | "I pressed her thigh and death smiled."
<clive@epos.easynet.co.uk> | Jim Morrison.
------------------------------
From: Systemkennung Linux <linux@informatik.uni-koblenz.de>
Date: Thu, 13 Feb 1997 03:19:50 +0100 (MET)
Subject: Re: IDE Disk Problems
> My company recently received around 100 HPs with Western Digital 1.6s
> in them. It's just a guesstimate, but I believe that around 15-20% of
> them have had drive problems within the first two months. These are
> mostly running NT; however, the one in my area is running Linux and I
> suspect it too is having problems. I've basically lost both of my
> bootable Linux partitions once (at different times...) due to drive errors.
We've got seven IDE WD disks during the last half year. That makes more
than 60% failure rate. Some of these were apparently killed by the
Firmware/BIOS problem in combination with Asus boards. The three
1.6GB disks were not used with the affected Asus boards. One of them
ruined my last weekend, so goodbye, WD.
During the same time span no other disk of another brand failed, neither
SCSI nor IDE.
Ralf
------------------------------
From: amu@mit.edu (Aaron M. Ucko)
Date: 12 Feb 1997 21:41:47 -0500
Subject: Re: Unresolved symbols in /lib/modules/.../*.o?
fruviad <fruviad@coil.com> writes:
> I recompiled my kernel (2.0.18, Redhat) and now get a bunch of
> complaints upon boot. Something along the lines of:
>
> Finding module dependencies
> Unresolved symbols in /lib/modules/2.0.18/yadda/yadda.o
>
> where "yadda" corresponds to quite a few file/directory names.
You can ignore these messages. Red Hat installs modules for a *lot*
of drivers so that users are less likely to have to rebuild their
kernel. You are probably seeing the "Unresolved symbols" message
because those particular modules require symbols found in files which
Red Hat compiled into their kernel but you neither compiled into yours
nor built as modules. (For instance, you'll get messages for
everything in /lib/modules/2.0.18/scsi if you did not enable SCSI
support when you built your kernel.)
- --
Aaron M. Ucko (amu@mit.edu) | For Geek Code, PGP public key, and other info,
finger amu@monk.mit.edu. | "Kids! Bringing about Armageddon can be dangerous.
Do not attempt it in your home." -- T. Pratchett & N. Gaiman, _Good Omens_
------------------------------
From: buhr@stat.wisc.edu (Kevin Buhr)
Date: 12 Feb 1997 23:24:48 -0600
Subject: PATCH for 2.1.26: newly forked processes killed by "handled signals"
- --Multipart_Wed_Feb_12_23:24:48_1997-1
Content-Type: text/plain; charset=US-ASCII
Alan:
After your patch fixed the "socketpair" behaviour, I got "dump"
working and managed to discover another 2.1.26 kernel bug. The
changes to "sys_sigreturn" in "arch/i386/kernel/signal.c" that added
more rigorous checking of segment register values---I don't know when
they were added---have the unintended side effect of causing crashes
in certain very rare circumstances.
Basically, the "copy_thread" function of "arch/i386/kernel/process.c"
initializes the TSS's %gs with KERNEL_DS. If the newly forked child
is signaled immediately *and* if the child has a handler for that
signal, then the pseudobogus %gs value will be saved by "setup_frame"
(in "arch/i386/kernel/signal.c") and, when returning from the handler,
the "GET_SEG(gs)" in "sys_sigreturn" will barf and "do_exit(SIGSEGV)"
the child.
For reasons I still don't fully understand, if the child is allowed to
return from the "fork" call and begin execution before receiving the
signal, the %gs register is automagically "fixed" (but I can't figure
out where), and we never notice the problem. The enclosed
"signaltest.c" illustrates the bug: on a vanilla 2.1.26 kernel, the
child quietly dies immediately after handling the signal. A hacked up
kernel verifies that "GET_SEG(gs)" is the culprit.
The enclosed patch appears to fix the problem by having "copy_thread"
initialize %gs with "USER_DS" instead, as is done in other, similar
contexts. Does this break anything else?
Kevin <buhr@stat.wisc.edu>
- --Multipart_Wed_Feb_12_23:24:48_1997-1
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="signaltest.c"
Content-Transfer-Encoding: 7bit
#include <stdio.h>
#include <stdlib.h>     /* exit */
#include <sys/types.h>
#include <sys/wait.h>   /* wait */
#include <signal.h>
#include <unistd.h>     /* fork, sleep */

/* SIGUSR2 handler; on an affected kernel the child dies on return from it. */
void
handler(int sig)
{
    (void)sig;
    fprintf(stderr, "signal handled!\n");
}

int
main(void)
{
    int child, rc;

    signal(SIGUSR2, handler);
    child = fork();
    if (child == -1) {
        perror("fork");
        exit(1);
    }
    if (child == 0) {
        fprintf(stderr, "before sleep\n");
        sleep(30);
        fprintf(stderr, "after sleep\n");
        exit(0);
    }
    /***
     * uncomment the next line, and the child will survive
     ***/
    /* sleep(1); */
    kill(child, SIGUSR2);
    sleep(5);
    wait(&rc);
    printf("Status = %d\n", rc);
    return 0;
}
- --Multipart_Wed_Feb_12_23:24:48_1997-1
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="patch-2.1.26-signalbug"
Content-Transfer-Encoding: 7bit
- --- linux/arch/i386/kernel/process.c 1997/02/13 04:20:30 1.1
+++ linux/arch/i386/kernel/process.c 1997/02/13 04:20:44 1.2
@@ -486,7 +486,7 @@
p->tss.ss = KERNEL_DS;
p->tss.ds = KERNEL_DS;
p->tss.fs = USER_DS;
- - p->tss.gs = KERNEL_DS;
+ p->tss.gs = USER_DS;
p->tss.ss0 = KERNEL_DS;
p->tss.esp0 = p->kernel_stack_page + PAGE_SIZE;
p->tss.tr = _TSS(nr);
- --Multipart_Wed_Feb_12_23:24:48_1997-1--
------------------------------
From: jrs@foliage.com (J. Richard Sladkey)
Date: 13 Feb 1997 05:55:32 +0000
Subject: Re: Big malloc's.
John Carter <john@dwaf-hri.pwv.gov.za> writes:
> How do I write a program that will ask the kernel how much free
> physical memory there is available to me without causing a lot of
> swapping?
In "the old days" experiments showed that the syscall overhead was
negligibly different for 8k I/O buffers vs. 64k buffers. Nowadays you
can just use 1M buffers and assume the system can deal with it. Can
you really show that operating on 8M buffers gives you noticeably
better performance than 1M buffers?
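For the narrower question of asking the kernel how much physical memory
is free, one option on Linux is sysinfo(2). The following is only a
sketch: the helper names free_ram_bytes and total_ram_bytes are invented
for illustration, and the mem_unit field assumes a kernel and C library
considerably newer than the ones discussed in this thread. Keep in mind
that "free" here excludes memory the kernel is using for buffers and
cache, so the figure is a conservative lower bound.

```c
#include <sys/sysinfo.h>

/* Hypothetical helper: bytes of RAM the kernel reports as free right now,
 * or 0 if sysinfo() fails. */
unsigned long long free_ram_bytes(void)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return 0;
    /* freeram is counted in units of mem_unit bytes. */
    return (unsigned long long)si.freeram * si.mem_unit;
}

/* Hypothetical helper: total usable RAM in bytes, or 0 on failure. */
unsigned long long total_ram_bytes(void)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return 0;
    return (unsigned long long)si.totalram * si.mem_unit;
}
```

A program could call free_ram_bytes() once at startup and size its
working buffer to some fraction of the result, rather than probing with
ever larger malloc() calls.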
------------------------------
From: Cameron MacKinnon <mackin@interlog.com>
Date: Thu, 13 Feb 1997 01:19:16 -0500
Subject: Re: 2.0.27 major problems #1 -- 3c59x driver.
> From: Chris Evans <chris@ferret.lmh.ox.ac.uk>
> On Wed, 12 Feb 1997, Philip Blundell wrote:
> > A transmitter access conflict is not disaster. There is no need to
> > reinitialise the controller - all it means is that the driver's
> > transmit routine was reentered, and the second transmit was deferred to
> > avoid contention.
>
> I am forced to disagree -- when your card hangs it certainly _is_ a
> disaster. Additionally, the code implies that that if execution reaches
> this stage it is a disaster anyway; quote "if this ever happens then the
> queue layer is doing something evil"
NOT being an expert in the Linux networking code, a few disinterested
observations:
- - Maybe the evil IS in the queue layer, and others haven't noticed as
their ethernet performance isn't as stellar as yours. Do the errors
occur randomly, or only under high load?
- - Is there any way of a) dumping the stack and freezing when the error
occurs, so as to analyze the state of the kernel that led to the error
(easy, just write it 8-), b) writing a special return code when this
occurs so that succeeding higher layers of network code can dump all
appropriate state (see answer to a above) c) disabling all except disk
interrupts and writing a kernel or entire machine core image to swap
space when this occurs?
Maybe I've missed some information on this thread, but the information
I've seen so far "Somewhere at or after reaching <vaguely defined state
x> my machine hangs" doesn't give a potential debugger much to go on.
How many printk()s have been added to the code so far in an attempt to
understand what's going on? Calls to a function to dump state? Rather
than wasting time arguing whether it's a problem or not, the affected
user should endeavour to provide as much information as possible. This
may involve kernel modifications, hired help, packet sniffers, in
circuit emulators, experimentation with different hardware, voodoo, dead
poultry and inconvenience to users. On the other hand, if he finds it
more cost effective to replace the offending hardware and/or OS, so be
it.
My sincerest apologies if I've missed something relevant.
------------------------------
From: "e9018967@stud2.tuwien.ac.at" <e9018967@stud2.tuwien.ac.at>
Date: Thu, 13 Feb 1997 00:24:35 +0100 (MET)
Subject: hard disk drive status
hello all!
as i'm working on a strategy to save power on laptops by switching off the hard disk
drive, i would like to know if there is a status bit on a ide hd which tells whether
the motor of the drive is running or not.
i know of hdparm, but i want to do other stuff, too.
any suggestions will be appreciated.
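One possible starting point, offered as a sketch rather than a tested
recipe: the ATA CHECK POWER MODE command (opcode 0xE5) returns the
drive's power state in the sector count register, and the Linux IDE
driver exposes raw drive commands through the HDIO_DRIVE_CMD ioctl. The
function names below are made up for illustration, the ioctl normally
needs root, and whether a given kernel and drive support it is an
assumption to verify.

```c
#include <fcntl.h>
#include <linux/hdreg.h>   /* HDIO_DRIVE_CMD */
#include <sys/ioctl.h>
#include <unistd.h>

#define ATA_CHECK_POWER_MODE 0xE5   /* ATA "CHECK POWER MODE" opcode */

/* Translate the sector count value returned by CHECK POWER MODE. */
const char *decode_power_mode(unsigned char sector_count)
{
    switch (sector_count) {
    case 0x00: return "standby";       /* spindle normally stopped */
    case 0x80: return "idle";
    case 0xFF: return "active/idle";   /* spindle running */
    default:   return "unknown";
    }
}

/* Hypothetical helper: ask the drive behind dev (e.g. "/dev/hda") for its
 * power mode. Returns 0 and stores the raw byte in *mode, or -1 on error. */
int query_power_mode(const char *dev, unsigned char *mode)
{
    /* In: command, sector number, feature, sector count.
     * Out: status, error, sector count. */
    unsigned char args[4] = { ATA_CHECK_POWER_MODE, 0, 0, 0 };
    int fd = open(dev, O_RDONLY | O_NONBLOCK);

    if (fd < 0)
        return -1;
    if (ioctl(fd, HDIO_DRIVE_CMD, args) != 0) {
        close(fd);
        return -1;
    }
    close(fd);
    *mode = args[2];
    return 0;
}
```

The status query itself should not spin the drive up, which matters for
a power-saving daemon that polls periodically.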
gernot
- --
gernot kerschbaumer, student of computer science at the technical university of vienna
email: e9018967@stud2.tuwien.ac.at
homepage: http://stud2.tuwien.ac.at/~e9018967/
------------------------------
From: Christian Rienecker <RIENECKE@vxdesy.desy.de>
Date: Thu, 13 Feb 1997 09:28:24 +0100
Subject: Re: IDE Disk Problems
On Wed, 12 Feb 1997 travis@zeus.rwii.com wrote:
>
> Mime-Version: 1.0
> Content-Type: text/plain; charset=us-ascii
> Date: Wed, 12 Feb 1997 18:42:40 -0500
> From: Travis Woodbury <travis@rwii.com>
>
>
> > [snip]
> > > Again, on the performance side, I get upwards of 25MB/s on either of my
> > > Seagate UltraWide scsi devices; a measurement that you are very unlikely
> > > to see on _any_ IDE drive (note: this is while the system is under
> > > considerable load, too).
> > >
> >
> > How did you measure that 25MB/s ? My 2 Maxtor IDE 2.0G's give me an
> > 'hdparm -T' value (cached xfer) of 32.0MB/s on a Micronics PCI/IDE
> > M54HI-Plus with a iP200 cpu. (My 'hdparm -t' >4.0 ie non-cached xfer) :-)
>
> What's the cached transfer rate matter? It's the overall throughput that
> matters.
>
> With a Buslogic BT-958 and a Seagate Barracuda (ST32171W) I get just over
> 8.5 MB/s with hdparm...no, that's not the cached rate. Unfortunately it
> didn't come cheap:
>
> BT-958 $241
> ST32171W $695
>
> IDE seems to be the right choice for the everyday workstation. I get great
> throughput with the Quantum Fireballs: 6.50 MB/s. $252
>
> But for a small server or heavily used workstation an Ultra Wide bus of
> fast drives kicks ass.
>
> One of these days I'll have to find the time to stripe two of those
> ST32171W's :)
>
> Not that this affects kernel development :)
>
> -Travis Woodbury
>
>
>
Could it be that you are confusing MBytes with MBits ?
'Cause even with PIO mode 4 an EIDE bus cannot exceed 16 MBytes/sec
regardless of where the data comes from.
Furthermore, according to my sources there is no SCSI HD which
exceeds 10 MBytes/sec real transfer (from disk, not cache) despite
having the theoretical limit of 40 MBytes/sec for an UltraWide controller.
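The bus limits quoted above follow from simple arithmetic, sketched
here. The function names are invented for illustration; the inputs are
the standard figures of a 120 ns cycle for PIO mode 4 on a 16-bit IDE
bus and a 50 ns (20 MHz) cycle on a 16-bit Ultra Wide SCSI bus.

```c
/* Peak burst rate in MB/s (1 MB = 10^6 bytes) for a bus that moves
 * width_bits per transfer with a cycle time of cycle_ns nanoseconds. */
double peak_mbytes_per_s(unsigned width_bits, double cycle_ns)
{
    /* bytes per nanosecond equals GB/s; scale by 1000 to get MB/s */
    return (width_bits / 8.0) / cycle_ns * 1000.0;
}

/* Convert MBytes/s to MBits/s, the unit the two may have been mixed up with. */
double mbytes_to_mbits(double mbytes_per_s)
{
    return mbytes_per_s * 8.0;
}
```

peak_mbytes_per_s(16, 120.0) gives about 16.7 for PIO mode 4, and
peak_mbytes_per_s(16, 50.0) gives 40 for Ultra Wide SCSI. A figure of
25 "MB/s" read as megabits would be only about 3 MBytes/s.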
CR.
-----------------------------------------------------------
| Christian Rienecker Tel. : +49 40 8998-2926 (Office) |
| HASYLAB at DESY -2368 (Lab) |
| Notkestr. 85 Fax. : +49 40 8998-2987 |
| 22603 Hamburg E-mail (Internet) |
| Germany : RIENECKE@desy.de |
-----------------------------------------------------------
------------------------------
From: koenig@tat.physik.uni-tuebingen.de (Harald Koenig)
Date: Thu, 13 Feb 1997 08:58:38 +0100 (MET)
Subject: Re: ramdisk problem 2.1.2[56]
> > When I attempt to build an ext2 filesystem on a ramdisk, mke2fs complains
> > about not being able to erase block 0. I am able to mount the file system
> > but any file write operation on the mounted file system hangs. Is this a
> > known problem? I am fairly sure that all of my pertinent libs etc. are at
> > the correct levels. I see the same problem with 2.1.25 also.
>
> It is a known problem, and the solution is to read the man page :-)
> (The option you need to use is '-F')
-F Force mke2fs to run, even if the specified device
is not a block special device.
hmm, since when are ramdisks no longer block special devices ?
I never need "-F" for "mke2fs /dev/ram0" ...
Harald
- --
All SCSI disks will from now on ___ _____
be required to send an email notice 0--,| /OOOOOOO\
24 hours prior to complete hardware failure! <_/ / /OOOOOOOOOOO\
\ \/OOOOOOOOOOOOOOO\
\ OOOOOOOOOOOOOOOOO|//
Harald Koenig, \/\/\/\/\/\/\/\/\/
Inst.f.Theoret.Astrophysik // / \\ \
koenig@tat.physik.uni-tuebingen.de ^^^^^ ^^^^^
------------------------------
From: Bernd Schmidt <crux@Pool.Informatik.RWTH-Aachen.DE>
Date: Thu, 13 Feb 1997 10:37:07 +0100 (MET)
Subject: Re: CONUNDRUM.
On Thu, 13 Feb 1997, Clive Messer wrote:
> On Thu, 13 Feb 1997, Systemkennung Linux wrote:
>
> > > > First guess Alexey. Linux crt0.s doesnt align the stack on an eight byte
> > > > boundary.
> > >
> > > Any reason why it shouldn't? Could this simple thing cause a >10%
> > > performance drop for some applications? If so why wasn't this noticed and
> > > fixed long ago?
> >
> > Alan's first guess makes sense in that the Fortran people have been
> > complaining about exactly that problem for quite some time.
>
> I just noticed in HJ's libc Changelog, 5.4.21-5.4.22 .......
>
> * sysdeps/linux/i386/crt/crt0.S: align stack to 8 bytes.
... which won't help you because the code that GCC generates will happily
break the alignment on all possible occasions (function calls, function entry)
The latest pgcc snapshot has an option "-mstack-align-double" which tries to
keep the stack alignment at 64 bits throughout the whole program. There are
some problems with that option yet, I think they'll be sorted out in the next
version.
Bernd
------------------------------
From: John Carter <john@dwaf-hri.pwv.gov.za>
Date: Thu, 13 Feb 1997 12:41:14 +0200 (SAT)
Subject: RE: B*gg*r mallocs, mmap JOY!
Ha! I just _luv_ Linux.
Thanks to Ingo Molnar and many others on the list who gave me clues
along the way...
Here is an example program for anyone else who wishes to use
mmap. (Note you _can_ use mmap to write to files, the trick is
ftruncate())
It does what 'cp' does, but a little bit slower. A cute thing to note
is the load graph soars to 3, maybe it would be civilized to 'nice'
any programs you run that do this.
The joy of it is that if you are actually doing real work like
neighbourhood operations on a satellite image, you can forget all
those nasty horrid boundary conditions on the buffers, as there are no
buffers (visible).
50% of the code and 90% of the bugs of such a program is typically
handling boundary conditions on the buffering.
I LOVE LINUX. I tested this creating 150Mb of VM on a 24Mb real mem
system and used 0 swap space.
By gorrah, what a mess programming in DOS and Vax/VMS was in
comparison.
John Carter EMail: ece@dwaf-hri.pwv.gov.za
Telephone : 27-12-808-0374x194 Fax:- 27-12-808-0338
Founder of the Council for Unnatural Scientists.
======================================================================
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <stdio.h>
#include <string.h>   // memcpy
#include <iostream>
using namespace std;
int main()
{
int ifd, ofd;
struct stat st;
char * icp, * ocp;
ifd = open( "/home/john/pgm/tryout/temp", O_RDONLY);
if( ifd < 0)
{
perror( "input file");
return 1;
}
fstat( ifd, &st);
cout << "Open " << st.st_size << " byte input file.\n";
ofd = open( "/home/john/pgm/tryout/temp1", O_CREAT | O_RDWR, S_IRWXU);
if( ofd < 0)
{
perror( "output file");
return 1;
}
cout << "Opened output file, now truncating file at " <<
st.st_size / 1024 / 1024 << " Mb's.\n";
ftruncate( ofd, st.st_size); // Note this cuty...
// ftruncate is not in glibc, its a system call, and works instantaneously.
cout << "Truncated file.\n";
icp = (char *)mmap( 0, st.st_size,
PROT_READ, MAP_FILE | MAP_SHARED, ifd, 0);
if( icp == (char *)-1)
{
perror( "input memory map");
return 1;
}
cout << "Mapped input file.\n";
ocp = (char *)mmap( 0, st.st_size,
PROT_WRITE, MAP_FILE | MAP_SHARED, ofd, 0);
if( ocp == (char *)-1)
{
perror( "output memory map");
return 1;
}
cout << "Mapped output file.\n";
memcpy( ocp, icp, st.st_size); // Shove data across, this is where
// the real work would go.
cout << "Done the mem copy.\n";
munmap( icp, st.st_size);
cout << "Unmapped input array.\n";
munmap( ocp, st.st_size);
cout << "Unmapped output array.\n";
if( close( ifd) < 0)
{
perror( "close input file");
return 1;
}
cout << "Closed input file.\n";
if( close( ofd) < 0)
{
perror( "close output file");
return 1;
}
cout << "Closed output file.\n";
}
------------------------------
From: Hugo Van den Berg <hbe@tommie.cypres.nl>
Date: Thu, 13 Feb 1997 10:20:31 +0100 (MET)
Subject: Re: IDE Disk Problems
On Wed, 12 Feb 1997, Pavel Galynin wrote:
> I'm having trouble installing Linux and several people suggested it had
> something to do with my WD IDE drives and EZ-drive software.Is that
> true?
> Paul
The WD's only if they give you problems with other OS's as well, see the
rest of this thread for comments. EZ-drive may use some drive geometry
different from the standard; this could give you trouble.
Hugo
------------------------------
From: schoebel@informatik.uni-stuttgart.de (Thomas Schoebel-Theuer)
Date: 13 Feb 1997 11:16:41 GMT
Subject: VFS/Posix question
Hi,
I'm just doing some patches in the kernel for enabling omirr-support.
omirr (online mirror) will support *symmetric* online mirroring of filesystems
once it is available.
I have found the following curiosity: if open() is called with flags
O_CREAT | O_EXCL , it returns an error if the "file" already exists.
However, what is really implemented is to check whether _some_ inode
with the given name already exists. If the name exists in the form of
a symlink pointing in turn to a nonexistent name, it will be an error, too.
However, I would expect that the non-existing name should be created
instead (without returning an error), as is the case with leaving out the
O_EXCL bit.
I don't have access to the Posix standards. Could anyone clarify what
should be the correct behaviour?
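For reference, POSIX specifies that when both O_CREAT and O_EXCL are set
and the path names a symbolic link, open() fails with EEXIST regardless
of what the link points to, which matches the behaviour observed. A
small sketch to check this on a given system follows; the function name
and the "link"/"target" file names are made up for illustration, and the
caller is assumed to pass an empty scratch directory.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Create a dangling symlink inside dir, try to open it with
 * O_CREAT | O_EXCL, and report what happened: the errno value from
 * open(), 0 if open() unexpectedly succeeded, or -1 if setup failed. */
int excl_open_on_dangling_symlink(const char *dir)
{
    char target[4096], link_path[4096];
    int fd, err;

    snprintf(target, sizeof target, "%s/target", dir);
    snprintf(link_path, sizeof link_path, "%s/link", dir);

    if (symlink(target, link_path) != 0)   /* target does not exist */
        return -1;

    fd = open(link_path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    err = (fd < 0) ? errno : 0;
    if (fd >= 0)
        close(fd);

    unlink(link_path);
    unlink(target);   /* only present if open() created it */
    return err;
}
```

Dropping O_EXCL from the flags makes open() follow the link and create
the target instead, which is the contrast described above.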
- -- Thomas
------------------------------
From: Ingo Molnar <mingo@pc5829.hil.siemens.at>
Date: Thu, 13 Feb 1997 12:25:32 +0100 (MET)
Subject: Re: CONUNDRUM.
On Thu, 13 Feb 1997, Bernd Schmidt wrote:
> The latest pgcc snapshot has an option "-mstack-align-double" which tries to
> keep the stack alignment at 64 bits throughout the whole program. There are
> some problems with that option yet, I think they'll be sorted out in the next
> version.
i guess this is enough until the compiler does it right:
#define double double __attribute__ ((aligned (8)))
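Whether a particular double really lands on an 8-byte boundary can be
checked at run time. This is a minimal sketch with invented helper
names; it only inspects one local variable in one function, so it says
nothing about alignment elsewhere in a program.

```c
#include <stdint.h>
#include <stdio.h>

/* Nonzero if pointer p is a multiple of n bytes. */
int is_aligned(const void *p, unsigned n)
{
    return ((uintptr_t)p % n) == 0;
}

/* Print the address of a local double and whether it is 8-byte aligned. */
void report_double_alignment(void)
{
    double d = 0.0;

    printf("&d = %p, 8-byte aligned: %s\n",
           (void *)&d, is_aligned(&d, 8) ? "yes" : "no");
}
```

Calling report_double_alignment() from several call depths in a build
with and without the alignment option gives a rough picture of what the
compiler is doing with the stack.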
- -- mingo
------------------------------
From: Ulrich Windl <windl@rkdvmhp1.dvm.klinik.uni-regensburg.de>
Date: Thu, 13 Feb 1997 12:32:09 +0100 (MEZ)
Subject: NFS problem with 2.0.25
When I compile within Emacs on a NFS mounted filesystem, I frequently get
the following error from make when I saved a source file immediately
before starting make:
make: *** File `TextArea.cc' has modification time in the future
If I re-run the compilation, it will succeed. The clocks of both, the
client and the server are synchronized within 50ms. I suspect the NFS
attribute cache being flushed on write, and the attributes on the server
are not yet updated, or is it just the other way round (attributes in the
cache are still old)?
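One way to gather evidence for either theory is to stat() the file on
the client right after saving it and compare the reported modification
time with the client's own clock. The sketch below assumes nothing about
the NFS implementation, and the function name is invented for
illustration.

```c
#include <sys/stat.h>
#include <time.h>

/* Store in *ahead the number of seconds by which path's modification
 * time lies ahead of the local clock (zero or negative means it is not
 * in the future). Returns 0 on success, -1 if stat() fails. */
int mtime_ahead_of_clock(const char *path, double *ahead)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    *ahead = difftime(st.st_mtime, time(NULL));
    return 0;
}
```

A positive result for a file that was just saved would suggest the time
stamp comes from the server's clock, since these time stamps have
one-second resolution and even a small offset can cross a second
boundary.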
Ulrich
------------------------------
From: Tom Olson <tjolson@voicenet.com>
Date: Sat, 08 Feb 1997 17:23:16 -0500
Subject: MENUCONFIG errors
I'm running Slackware 96 (2.0.20). I want to use the "old" method to
configure sound into my kernel, but I seem to be only using
the "new" method. I read that I can select the old method using "make
menuconfig". When I run menuconfig I get this:
bash# make menuconfig
rm -f include/asm
( cd include ; ln -sf asm-i386 asm)
make -C scripts/lxdialog all
make[1]: Entering directory `/usr/src/linux-2.0.20/scripts/lxdialog'
gcc -O2 -Wall -fomit-frame-pointer -DLOCALE -DCURSES_LOC="<curses.h>"
- -c lxdialog.c -o lxdialog.o
In file included from lxdialog.c:22:
dialog.h:109: parse error before `use_colors'
dialog.h:109: warning: data definition has no type or storage class
dialog.h:110: parse error before `use_shadow'
dialog.h:110: warning: data definition has no type or storage class
dialog.h:112: parse error before `attributes'
dialog.h:112: warning: data definition has no type or storage class
dialog.h:125: parse error before `chtype'
dialog.h:130: parse error before `chtype'
make[1]: *** [lxdialog.o] Error 1
make[1]: Leaving directory `/usr/src/linux-2.0.20/scripts/lxdialog'
make: *** [menuconfig] Error 2
Can you help with this error?
Thanks,
Tom Olson
------------------------------
From: Hugo Van den Berg <hbe@cypres.nl>
Date: Thu, 13 Feb 1997 10:32:23 +0100 ()
Subject: Re: IDE Disk Problems
On Thu, 13 Feb 1997 bofh@snoopy.virtual.net.au wrote:
<snip>
>
> I have read a number of messages related to the quality of IDE drives
> and WD drives in particular and I believe that I have to respond to give
> the other side of the story.
> I have had what I consider to be a reasonable amount of sys-admin
> experience (including running an ISP for over a year). Currently I run
> 3 OS/2 servers, 5 Linux servers, 2 NT servers, and quite a few
> workstations. All the OS/2 and Linux machines have IDE hard drives,
> most of the hard drives are WD (about 10 WD drives in operation now, but
> I've got rid of a few of the smaller ones - 340meg drives aren't
> much use now). I have not had a single problem with a WD drive that
> could be attributed to the drive (mis-use of `rm` doesn't count as a
> drive problem). However with the NT systems running SCSI drives
> (Seagate and Maxtor drives mainly with Adaptec, NCR, and DPT
> controllers) I have had heaps of problems. Strange crashes on boot,
> data loss in running system, systems booting up and suddenly crashing
> when previously they had worked fine.
Now you're comparing NT with Linux. There is a stability difference, you
know. I've had the same thing happen on NT with IDE drives, so the problem
is IMHO NT, and not the disks.
> Based on the experiences with SCSI the client has now decided to save
> money and buy IDE - the extra money they spent on SCSI wasn't getting
> them any extra performance or reliability.
>
> As for performance, I recall seeing a message from Mark Lord saying
> that in most Linux systems you won't gain anything from SCSI. Save the
> $300 on a SCSI controller and get 64meg of RAM - it'll make your system
> faster and more reliable than SCSI.
That depends. If it's a multiuser system things like tagged command
queueing will give you much better concurrency on a SCSI disk. SCSI also
gives a lower bus load, leaving more CPU time available for processes. I
have a 3 year old IBM spitfire on an Adaptec 2940, with a 486 CPU acting
as a fileserver that outperforms most pentiums with IDE drives, especially
when users access it concurrently.
On the other hand if you have a single user desktop machine the extra
money is probably wasted.
>
> Russell Coker
>
--------------------------------------
Hugo Van den Berg - hbe@cypres.nl
Phone - +31 (0)30 - 60 25 400
Fax - +31 (0)30 - 60 50 799
--------------------------------------
------------------------------
From: Paul Gortmaker <paul@rasty.anu.edu.au>
Date: Fri, 14 Feb 1997 00:09:16 +1000 (EST)
Subject: Re: Performance patch for NE Ethernet
This probably belongs on linux-net and not linux-kernel, fwiw.
> The following improves the performance of NE* clones.
Not to be too harsh on anyone that is trying to help out, but the
above claim is a bit dubious...
> /* This check _should_not_ be necessary, omit eventually. */
> - while ((inb_p(NE_BASE+EN0_ISR) & ENISR_RESET) == 0)
> + while ((inb_p((NE_BASE+EN0_ISR)) & ENISR_RESET) == 0)
Umm, adding extra sets of "( ... )" doesn't gain anything other than
making it more work for me to read through the patch to see what you
are trying to change. There are lots of these in this patch...
> - outb_p(ENISR_RESET, NE_BASE + EN0_ISR); /* Ack intr. */
> + outb_p(ENISR_RESET, (NE_BASE + EN0_ISR)); /* Ack intr. */
See above...
> - the start of a page, so we optimize accordingly. */
> + the start of a page, so we optimize accordingly.i
Ummm, vi typo constitutes a patch? (Okay, we all do that now and again...)
> - /* This *shouldn't* happen. If it does, it's the last thing you'll see */
> - if (ei_status.dmaing) {
> + /* This *shouldn't* happen. If it does, we'll fix it. */
> + if (ei_status.dmaing)
> + {
No, that is written as such because it most likely *will* be the last
thing you see, due to the way the hardware is. If the printk gets to the
screen, then you have done well. Also, putting a { on a newline doesn't
do anything more than make the patch bigger...
> + ne_reset_8390(dev);
Well, a post "Oops, shouldn't have done that - reset the card." looks
like a good idea, I doubt it will buy you a recovery.
> - outb_p(E8390_NODMA+E8390_PAGE0+E8390_START, nic_base+ NE_CMD);
> + outb_p(E8390_NODMA+E8390_PAGE0+E8390_START, (NE_BASE + NE_CMD));
brackets... NE_BASE macro vs. nic_base will be obscured by GCC.
> - outb_p(sizeof(struct e8390_pkt_hdr), nic_base + EN0_RCNTLO);
> + outb_p(sizeof(struct e8390_pkt_hdr), (NE_BASE + EN0_RCNTLO));
brackets...
> - outb_p(0, nic_base + EN0_RCNTHI);
> - outb_p(0, nic_base + EN0_RSARLO); /* On page boundary */
> + outw(0, (NE_BASE + EN0_RSARLO)); /* On page boundary */
Bzzt. Can't do that. You have ne1000 cards to deal with, as well as
the fact that access to the 8390 requires the outb_p() -- see the
datasheets if you are in doubt.
> - outb_p(ring_page, nic_base + EN0_RSARHI);
> + outb_p(ring_page, (NE_BASE + EN0_RSARHI));
More brackets...
> - outb_p(E8390_RREAD+E8390_START, nic_base + NE_CMD);
> + outb_p(E8390_RREAD+E8390_START, (NE_BASE + NE_CMD));
More brackets...
> - insw(NE_BASE + NE_DATAPORT, hdr, sizeof(struct e8390_pkt_hdr)>>1);
> + insw((NE_BASE + NE_DATAPORT), hdr, sizeof(struct e8390_pkt_hdr)>>1);
More brackets...
> - insb(NE_BASE + NE_DATAPORT, hdr, sizeof(struct e8390_pkt_hdr));
> + insb((NE_BASE + NE_DATAPORT), hdr, sizeof(struct e8390_pkt_hdr));
More brackets...
> - outb_p(ENISR_RDC, nic_base + EN0_ISR); /* Ack intr. */
> + outb_p(ENISR_RDC, (NE_BASE + EN0_ISR)); /* Ack intr. */
More brackets...
From here on, you have deleted a bunch of code that is ifdef'd out
anyways. That doesn't buy you anything in speed or size.
> - outb_p(count & 0xff, nic_base + EN0_RCNTLO);
> - outb_p(count >> 8, nic_base + EN0_RCNTHI);
> + outw(count, (NE_BASE + EN0_RCNTLO));
Nope, sorry, same as above. ne1000 cards, and you need the i/o pause.
Hitting an 8390 with back-to-back I/O is a good way to fill my mailbox.
> - outb_p(ring_offset & 0xff, nic_base + EN0_RSARLO);
> - outb_p(ring_offset >> 8, nic_base + EN0_RSARHI);
> + outw(ring_offset, (NE_BASE + EN0_RSARLO));
Same as above.
While I can appreciate your effort, you have to realize that 99.9%
of the ne driver CPU load is in moving the packet. The individual
read/writes to the 8390 registers account for nothing. (Profile the
driver if you don't believe me, as I have done so.) Furthermore, there
are a ridiculous number of cheap half-broken clone cards out there,
and we have to try and work with all of them.
> -#ifdef NE8390_RW_BUGFIX
> - /* Handle the read-before-write bug the same way as the
> - Crynwr packet driver -- the NatSemi method doesn't work.
> - Actually this doesn't always work either, but if you have
> - problems with your NEx000 this is better than nothing! */
[...]
> -#ifdef NE_SANITY_CHECK
> - /* This was for the ALPHA version only, but enough people have
> - been encountering problems so it is still here. */
Furthermore, deleting code that is already #ifdef'ed out doesn't
get you anything in terms of size or performance. When the next
poor sod has to come along and understand the driver, it may help
him/her out a lot to be able to read that code, even if it is inactive.
Paul.
(a.k.a. current unfortunate sod who deals with ne.c)
> Cheers,
> Dick Johnson
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Richard B. Johnson
> Project Engineer
> Analogic Corporation
> Voice : (508) 977-3000 ext. 3754
> Fax : (508) 532-6097
> Modem : (508) 977-6870
> Ftp : ftp@boneserver.analogic.com
> Email : rjohnson@analogic.com, johnson@analogic.com
> Penguin : Linux version 2.1.26 on an i586 machine (66.15 BogoMips).
> Warning : It's hard to remain at the trailing edge of technology.
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
------------------------------
From: bengel@xs4all.nl (Roeland Th. Jansen)
Date: 13 Feb 1997 13:12:42 -0000
Subject: Re: MENUCONFIG errors
Tom Olson <tjolson@voicenet.com> wrote:
> I'm running Slackware 96 (2.0.20). I want to use the "old" method to
> configure sound into my kernel, but I seem to be only using
> the "new" method. I read that I can select the old method using "make
> menuconfig". When I run menuconfig I get this:
[....]
> Can you help with this error?
not sure; clean the whole stuff; add the patches to 2.0.29 and try again. if
it again fails, email me and we'll see if something can be worked out.
--
Grobbebol's Home (Linux 2.0.x i586)
------------------------------
From: "Richard B. Johnson" <root@analogic.com>
Date: Thu, 13 Feb 1997 08:21:07 -0500 (EST)
Subject: Re: 2.0.27 major problems #1 -- 3c59x driver.
On Thu, 13 Feb 1997, Cameron MacKinnon wrote:
> > From: Chris Evans <chris@ferret.lmh.ox.ac.uk>
> > On Wed, 12 Feb 1997, Philip Blundell wrote:
> > > A transmitter access conflict is not disaster. There is no need to
> > > reinitialise the controller - all it means is that the driver's
> > > transmit routine was reentered, and the second transmit was deferred to
> > > avoid contention.
> >
> > I am forced to disagree -- when your card hangs it certainly _is_ a
> > disaster. Additionally, the code implies that if execution reaches
> > this stage it is a disaster anyway; quote "if this ever happens then the
> > queue layer is doing something evil"
>
> NOT being an expert in the Linux networking code, a few disinterested
> observations:
> - Maybe the evil IS in the queue layer, and others haven't noticed as
> their ethernet performance isn't as stellar as yours. Do the errors
> occur randomly, or only under high load?
[SNIPPED]
The "doing something evil" is quite often that the queue layer tries to
transmit another packet before the last one was transmitted. When you
tell the usual Ethernet chip to transmit a packet, the modern chips, i.e.,
NOT the 8390, only "promise" to get it transmitted. Older chips would
not return "good" status until the packet was actually transmitted. The
chips do automatic retries to handle collisions. The more modern chips
return "good" status as soon as the packet is ready in its buffer. This
allows the driver CPU to do something else while the packet is actually
being sent.
However, most hardware hangs occasionally. The least overhead fix of
all the Ethernet chips I've programmed is to reset the thing and start over.
Even the chips that require being set into loop-back mode during
initialization, take less than a typical packet transmission time to
reprogram. The result is a lost packet which will be re-sent when requested.
Under high load, I often get the "couldn't allocate a sk_buff", and
"memory squeeze, dropping packet", errors. They really should be displayed
only when debugging is turned on. The correct operation to perform when
resources are not available is to drop the packet (which is what many
drivers do). However, some drivers attempt heroics including busy-waiting.
This is most often counter-productive.
I have some tools that record TCP/IP transmission/roundtrip speeds. If
anyone is interested, contact me via private email.
Cheers,
Dick Johnson
- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Richard B. Johnson
Project Engineer
Analogic Corporation
Voice : (508) 977-3000 ext. 3754
Fax : (508) 532-6097
Modem : (508) 977-6870
Ftp : ftp@boneserver.analogic.com
Email : rjohnson@analogic.com, johnson@analogic.com
Penguin : Linux version 2.1.26 on an i586 machine (66.15 BogoMips).
Warning : It's hard to remain at the trailing edge of technology.
- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
------------------------------
From: Brian Lalor <blalor@hcirisc.cs.binghamton.edu>
Date: Thu, 13 Feb 1997 10:15:25 -0500 (EST)
Subject: [none]
I'm currently using the 2.1.25 kernel and I'm running into some SCSI
problems. I'm using an IBM PS/2 Model 95 (MCA) with an IBM MCA SCSI host
adapter. I've been able to boot using an MCA-patched 2.0.7 kernel and
the machine has worked flawlessly. Now that I'm using the newer kernel,
my drives (2 hard drives and 1 CD-ROM) appear as different luns on ID 0
and the CD-ROM doesn't work at all. The first time I booted after
installing the CD-ROM drive, it was recognized as (I believe) lun 3 on ID
0, but subsequent boots fail to get it to be recognized at all. The
CD-ROM is a Sony CDU-76S set to id 2 with parity enabled and
allow/disallow (?) removed.
This is from /var/log/messages:
Feb 13 08:44:38 blalor kernel: IBM MCA SCSI: IBM SCSI Adapter w/Cache found in slot 1, scsi id=7.
Feb 13 08:44:38 blalor kernel: IBM MCA SCSI: logical device found at ldn=0.
Feb 13 08:44:38 blalor kernel: IBM MCA SCSI: logical device found at ldn=1.
Feb 13 08:44:38 blalor kernel: IBM MCA SCSI: logical device found at ldn=2.
Feb 13 08:44:38 blalor kernel: scsi0 : IBMMCA
Feb 13 08:44:38 blalor kernel: scsi : 1 host.
Feb 13 08:44:38 blalor kernel: Vendor: IBM Model: 0661467 Rev: G o
Feb 13 08:44:38 blalor kernel: Type: Direct-Access ANSI SCSI revision: 02
Feb 13 08:44:38 blalor kernel: Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
Feb 13 08:44:38 blalor kernel: Vendor: FUJITSU Model: M2684S-512 Rev: 2035
Feb 13 08:44:38 blalor kernel: Type: Direct-Access ANSI SCSI revision: 02
Feb 13 08:44:38 blalor kernel: Detected scsi disk sdb at scsi0, channel 0, id 0, lun 2
Feb 13 08:44:38 blalor kernel: scsi : detected 2 SCSI disks total.
Feb 13 08:44:38 blalor kernel: SCSI device sda: hdwr sector= 512 bytes. Sectors= 776456 [379 MB] [0.4 GB]
Feb 13 08:44:38 blalor kernel: SCSI device sdb: hdwr sector= 512 bytes. Sectors= 1039329 [507 MB] [0.5 GB]
I'd appreciate any help you can give me.
I'm also getting some "oops"'s on boot. What's the best way to capture
these, as they don't seem to appear in /var/log/messages?
Thanks in advance,
Brian
_____________________________________________________________________________
B r i a n L a l o r blalor@hcirisc.cs.binghamton.edu
http://hcirisc.cs.binghamton.edu/~blalor consp32@binghamton.edu
------------------------------
From: Shigehiro Nomura <GBB00111@niftyserve.or.jp>
Date: Fri, 14 Feb 1997 00:14:00 +0900
Subject: Re: 640MB MO patch
Hi Yutaka,
Thanks for your reply.
> It seems (at least for me) your attitude is not polite enough to
> Eric and the Linux community.
I wrote:
> ..... has been released a few months ago, too :-)
This is only a fact; there are no other intentions. And,
I wrote:
>> My own patches are on andante.jic.com in pub/scsi.
>I will try it and send a report to you.
That's all.
Is there any problem?
> I think that we should learn/check how things are going on, before
> proposing a patch.
That's right.
Regards,
Shigehiro GBB00111@niftyserve.or.jp
------------------------------
From: worley@ariadne.com (Dale R. Worley)
Date: Thu, 13 Feb 1997 15:28:23 GMT
Subject: Re: Behavior under swap catastrophe?
In article <Mutt.19970212155407.jf@helix.caltech.edu>
jf@ugcs.caltech.edu (Joe Fouche) writes:
I've noticed lately that the behavior of the kernel when some
process goes berserk and fills up all the swap is a little
strange. It seems to start sending SEGV's to many processes as the
large one grows. This wouldn't be so bad, except that init is often
killed. Is a modification to protect the life of init in order? Or
should we just make sure this never happens?
My suspicion is that it is not really the kernel sending SEGV's, but
rather that the programs are calling malloc, which discovers that it
cannot grab more virtual memory, and so returns NULL to its caller.
The caller, blithely ignoring warnings to always check malloc's return
value, attempts to use it and promptly segment-faults.
Supposedly, if virtual memory runs out, Unix kills off
processes, starting with the most recently initiated one. Since init
is the oldest process, it is safe.
None of this explains why init dies. Perhaps init is not checking the
return value from malloc?
Dale
--
Dale R. Worley Ariadne Internet Services
Voice: +1 617-899-7949 Fax: +1 617-899-7946 E-mail: worley@ariadne.com
"Internet-based electronic commerce solutions to real business problems."
------------------------------
From: "Jonathan A. Davis" <jonathan@evergreen.cc.usm.edu>
Date: Thu, 13 Feb 1997 09:42:44 -0600 (CST)
Subject: Re: IDE Disk Problems
On Wed, 12 Feb 1997, Alan Cox wrote:
> > Now, my question is, can (or is it remotely possible) linux be
> > writing something out to the WD IDE drives to break the drive. The
> > analogy being, that you can write to a video controller and cause
> > damage to a monitor. One of our people have suggested that Linux is
> > writing to sector 0 of the WD IDE drive.
>
> No. IDE doesn't permit a host system to damage the disk. The bang bang sounds
> like the unit is dead.
>
> > These WDs drives were bought at separate times so we do not suspect
> > that we got hold of a bad batch of drives.
>
Would it happen to be a Caviar 21000 series? We had a lab full of them
(P5/120 Linux workstations). Turns out they had a firmware defect that
*really* showed up under heavy access (read: non-windoze). WD shipped us
a firmware upgrade, but, as we found out, ten minutes' worth of access on
the drive had already done the damage. Of the 29 machines, all but 5 have
been replaced.
WD fixed the problem before introduction of the 31000.
Cheers,
-Jonathan _ _
------------------------------------------------------------->>>>>>>>-(o)(o)---
Jonathan A. Davis | Academic Systems Analyst | Hattiesburg/Gulf Park/Stennis
Computing Center | Box 5171 | 39401-5171 | (601) 266-4103 | davis@cc.usm.edu
http://evergreen.cc.usm.edu/~davis | Linux: The choice of a GNU generation
------------------------------
From: "A.N.Kuznetsov" <kuznet@ms2.inr.ac.ru>
Date: Thu, 13 Feb 1997 19:13:22 +0300 (MSK)
Subject: Re: CONUNDRUM.
Hello!
It seems one guy guessed the correct answer.
He forgot to publish his brilliant ideas, so I'll do it for him.
Alexey Kuznetsov.
Forwarded message:
> From galexand@sietch.bloomington.in.us Thu Feb 13 04:26:02 1997
> Date: Wed, 12 Feb 1997 20:18:21 -0500 (EST)
> From: Greg Alexander <galexand@sietch.bloomington.in.us>
> To: "A.N.Kuznetsov" <kuznet@ms2.inr.ac.ru>
> Subject: Re: CONUNDRUM.
> In-Reply-To: <199702121746.UAA16330@ms2.inr.ac.ru>
> Message-ID: <Pine.LNX.3.95.970212201603.27250E-100000@sietch.sietch.bloomington.in.us>
> MIME-Version: 1.0
> Content-Type: TEXT/PLAIN; charset=US-ASCII
>
> On Wed, 12 Feb 1997, A.N.Kuznetsov wrote:
>
> > The explanations sort of "It is cache problem" or
> > "It is TLB trashing" are not accepted.
>
> You dumbass. I don't care if they are acceptable, but them's the limits of
> the hardware. A multi-tasking OS will be notably slower for some things
> than a non-multi-tasking OS such as DOS or windows.
>
> > Bare MSDOS, Windows3.11 and Windows95 show the same (good) result.
> >
> > You can get the program sources, necessary data,
> > and msdos binaries at ftp.inr.ac.ru:/CONUNDRUM.
> >
> > It prints lines sort of:
> > Step 0 of 6400 45.917 0
> > ^---------- Time measured by P5 CPU, normalized
> > to seconds, supposing that CPU clock is 100MHz.
> > So that, it is seconds only for 100MHz cpu.
>
> I'm fairly certain that your timing shit will not work under a 32-bit
> prot-mode OS in the same way as it would under DOS. Use the time command
> instead.
>
> > For MSDOS I see:
> > Step 0 of 6400 39.376 0
> >
> > Do not ask me what this program does, and why
> > 1.dg3 file is so huge. I have no idea.
> > The only thing that I know is that binary codes
> > really coincide.
>
> Then you know nothing and you are ignorant and have no right to state that
> it is slower. Watcom files will not run under Linux. I don't care how much
> they coincide.
>
> Greg Alexander
> http://www.cia-g.com/~sietch/
>
------------------------------
From: DUPRE Christophe <duprec@JSP.UMontreal.CA>
Date: Thu, 13 Feb 1997 11:21:46 -0500 (EST)
Subject: Linux & EISA bus ?
Is there any special configuration needed to use the EISA bus of a system ?
kernel is 2.0.28, and I seem to be unable to access a network card (DEC
425) on the EISA bus of a Digital Prioris HS 590...
Of course, I could have done something wrong with the System
Configuration Utility, since it's horribly documented, but I'd like to
search all the avenues at the same time...
Christophe Dupre Universite de Montreal
Montreal, Qc, Canada
"Nous ne sommes pas libres de ne pas etre libres, nous sommes obliges de
l'etre" - Fernando Savater
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS d- s:++ a-- C++(+++) UL++++$ UISV++ P+++ L+++ !E---- W+++$ N+ o? K w---
O M- V-- PS+ PE+ Y+ PGP+ t+ 5++ X+ R+ tv+ b++ DI- D G+ e>++ h- r++ z+
------END GEEK CODE BLOCK------
#include <disclaimer.h>
------------------------------
From: Mark Hemment <markhe@nextd.demon.co.uk>
Date: Thu, 13 Feb 1997 13:26:41 +0000 (GMT)
Subject: Re: Linux VM subsystem (Was: Big mallocs, mmap sorrows and double buffering.)
Hi,
I'm re-implementing the VM subsystem as I write this.
It's still v. much in the prototyping stage, but I've already
got mapping chains working for named pages (without any panics, or
object leaks!!!). When I've got chaining for anon pages working, it
will open the door to some nifty tricks (page flipping for some archs
(such as Intel), true page aging, true swap page clustering, and the death
of swap_out_process()).
My initial changes shrunk the size of the mem_map structure, which caused
an immediate improvement in find_page() (locality of reference and all
that).
Should have a 'development' patch out in a couple of weeks.
On Wed, 12 Feb 1997, Ingo Molnar wrote:
> On Wed, 12 Feb 1997, A.N.Kuznetsov wrote:
> > The statement is that other unices make this procedure
> > without pain, time grows almost linearly, and when running
> > with low priority in background it practically does not hamper other users.
> > Linux time grows EXPONENTIALLY after starting swap [...]
>
> i remember Linus commenting on such issues about a year ago. He proposed
> some code, and told people to send patches to him.
------------------------------------------------------------------
Mark Hemment, Unix/C Software Engineer (Contractor)
markhe@nextd.demon.co.uk http://www.demon.co.uk/
"Success has many fathers, failure is a B**TARD!" - anon
------------------------------------------------------------------
------------------------------
From: "Hugh W. Holbrook" <holbrook@DSG.Stanford.EDU>
Date: Thu, 13 Feb 1997 09:50:00 -0800
Subject: "Modules Oops" workaround for 2.1.26
I disabled the CONFIG_MODVERSIONS ("set version information on all
modules") option in my kernel configuration and recompiled everything.
Since doing so, depmod, modprobe and insmod seem to work again without
segfaulting.
depmod -a was previously segfaulting in the same way that many other
people on this list have noted (dump attached).
I'm running 2.1.26, libc-5.4.17, binutils 2.7.0.9, ld.so-1.8.9
I hope that this will be helpful in debugging the problem.
-Hugh Holbrook
Here is the type of oops that 'depmod -a' previously was causing:
----------------------------------------------------------------
Feb 13 00:37:58 Fridge kernel: Unable to handle kernel paging request at virtual address ecfcc01d
Feb 13 00:37:58 Fridge kernel: current->tss.cr3 = 02883000, |r3 = 02883000
Feb 13 00:37:58 Fridge kernel: *pde = 00000000
Feb 13 00:37:58 Fridge kernel: Oops: 0000
Feb 13 00:37:58 Fridge kernel: CPU: 0
Feb 13 00:37:58 Fridge kernel: EIP: 0010:[<c011695d>]
Feb 13 00:37:58 Fridge kernel: EFLAGS: 00010203
Feb 13 00:37:58 Fridge kernel: eax: 8fb00000 ebx: c01ddbd0 ecx: 0000003b edx: ecfcc01d
Feb 13 00:37:58 Fridge kernel: esi: ecfcc01d edi: c287ff84 ebp: c01c1422 esp: c287ff5c
Feb 13 00:37:58 Fridge kernel: ds: 0018 es: 0018 ss: 0018
Feb 13 00:37:58 Fridge kernel: Process depmod (pid: 112, process nr: 28, stackpage=c287f000)
Feb 13 00:37:58 Fridge kernel: Stack: c2990c0c bffff73c 00000002 bfffa7b8 c287ff80 c287ff84 c287ff85 00000000
Feb 13 00:37:58 Fridge kernel: 00000001 8fb00000 00000023 00000000 00000000 00000000 00000000 00000000
Feb 13 00:37:58 Fridge kernel: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Feb 13 00:37:58 Fridge kernel: Call Trace: [<c010a436>]
Feb 13 00:37:58 Fridge kernel: Code: ac aa 84 c0 75 f7 f3 aa c6 44 24 63 00 ba 40 00 00 00 a1 5c
------------------------------
From: hpa@transmeta.com (H. Peter Anvin)
Date: 13 Feb 1997 17:52:13 GMT
Subject: Re: Version bug in 2.0.29?
Followup to: <Pine.LNX.3.95.970208102313.666A-100000@ppp3.csudh.EDU>
By author: Trevor Johnson <trevor@blues.jpj.net>
In newsgroup: linux.dev.kernel
>
> There does seem to be a problem: "make oldconfig" doesn't create
> include/linux/version.h, whereas "make menuconfig" makes a correct one:
>
> /usr/src/linux# cat ./include/linux/version.h
> #define UTS_RELEASE "2.0.29"
> #define LINUX_VERSION_CODE 131101
>
A worse problem is that "make oldconfig" goes into an infinite loop
over the sound stuff, at least in our configuration. "make xconfig"
does do the right thing, though.
-hpa
--
This space intentionally has nothing but text explaining why this
space has nothing but text explaining that this space would otherwise
have been left blank, and would otherwise have been left blank.
------------------------------
From: "shadow" <shadow@rpa.net>
Date: Thu, 13 Feb 1997 12:57:06 +0000
Subject: Thanks, and another Question
I would like to thank those who answered my questions about having more
than 64 megs in a machine at once. Thank you, kindly gods :).
OK, my second question is about the same machine. Unfortunately it is
still behaving badly; this machine has always been a bother since the
beginning. It has crashed under FreeBSD as well as Linux (from 1.13? to
2.0.27). It has an Adaptec 2940UW with a Seagate 2.1 UW SCSI drive as well
as an EIDE drive, plus a cheap 16 ne1000 compatible ethernet card and a
PCI smc-ultra ethernet card.
The errors range from trashing the SCSI partitions (bad enough to kill
the kernel) to plain protection faults, VFS faults, and "free block list
corrupt"? I know these are vague, but there is more than one. I was
looking at the kernel help, and the pages say that in order to run with
more than 64 megs of memory, it needs 512k of cache?
If anyone has any ideas, I would love to hear them. Thank you for your
time and patience.
Scott Traynor
scott@rpa.net
------------------------------
From: Nathan Bryant <nathan@burgessinc.com>
Date: Thu, 13 Feb 1997 13:13:59 -0500 (EST)
Subject: Re: CONUNDRUM.
On Thu, 13 Feb 1997, A.N.Kuznetsov wrote:
> Hello!
>
> Seems, one guy guessed correct answer.
> He forgot to publish his brilliant ideas so that I'll make it.
Umm, I'll assume you're joking. His ideas don't sound so brilliant
to me.
>
> Alexey Kuznetsov.
>
>
> Forwarded message:
> > From galexand@sietch.bloomington.in.us Thu Feb 13 04:26:02 1997
> > Date: Wed, 12 Feb 1997 20:18:21 -0500 (EST)
> > From: Greg Alexander <galexand@sietch.bloomington.in.us>
> > To: "A.N.Kuznetsov" <kuznet@ms2.inr.ac.ru>
> > Subject: Re: CONUNDRUM.
> > In-Reply-To: <199702121746.UAA16330@ms2.inr.ac.ru>
> > Message-ID: <Pine.LNX.3.95.970212201603.27250E-100000@sietch.sietch.bloomington.in.us>
> > MIME-Version: 1.0
> > Content-Type: TEXT/PLAIN; charset=US-ASCII
> >
> > On Wed, 12 Feb 1997, A.N.Kuznetsov wrote:
> >
> > > The explanations sort of "It is cache problem" or
> > > "It is TLB trashing" are not accepted.
> >
> > You dumbass. I don't care if they are acceptable, but them's the limits of
Dumbass?
> > the hardware. A multi-tasking OS will be notably slower for some things
> > than a non-multi-tasking OS such as DOS or windows.
He's missing the point here: Windows *is* a multitasking operating system.
Even Windows 3.1 multitasks DOS applications *preemptively.* And you've
said that your program has the same results under Windows as it does under
DOS...
> >
> > > Bare MSDOS, Windows3.11 and Windows95 show the same (good) result.
> > >
> > > You can get the program sources, necessary data,
> > > and msdos binaries at ftp.inr.ac.ru:/CONUNDRUM.
> > >
> > > It prints lines sort of:
> > > Step 0 of 6400 45.917 0
> > > ^---------- Time measured by P5 CPU, normalized
> > > to seconds, supposing that CPU clock is 100MHz.
> > > So that, it is seconds only for 100MHz cpu.
> >
> > I'm fairly certain that your timing shit will not work under a 32-bit
> > prot-mode OS in the same way as it would under DOS. Use the time command
> > instead.
I don't think this is right, although I may be wrong. (Although Pentium
cycle-counters would also count all the overhead of the OS.) It may be
interesting, though, to see how much time the process was actually
scheduled for, using the times() syscall.
> >
> > > For MSDOS I see:
> > > Step 0 of 6400 39.376 0
> > >
> > > Do not ask me what this program does, and why
> > > 1.dg3 file is so huge. I have no idea.
> > > The only thing that I know is that binary codes
> > > really coincide.
> >
> > Then you know nothing and you are ignorant and have no right to state that
> > it is slower. Watcom files will not run under Linux. I don't care how much
> > they coincide.
Know nothing? Ignorant? I don't think so. If you had taken the time to
read his mail, you would realize that the Watcom code will run just fine,
since it's not doing any syscalls and just doing computations on memory.
He's disassembled the Watcom-produced code and assembled it again on
Linux.
> >
> > Greg Alexander
> > http://www.cia-g.com/~sietch/
> >
>
+-----------------------+---------------------------------------+
| Nathan Bryant | Unsolicited commercial e-mail WILL be |
| nathan@burgessinc.com | charged an $80/hr proofreading fee. |
+-----------------------+---------------------------------------+
------------------------------
From: alan@lxorguk.ukuu.org.uk (Alan Cox)
Date: Thu, 13 Feb 1997 08:43:31 +0000 (GMT)
Subject: Re: CONUNDRUM.
> > First guess Alexey. Linux crt0.s doesnt align the stack on an eight byte
> > boundary.
> Any reason why it shouldn't? Could this simple thing cause a >10%
> performance drop for some applications? If so why wasn't this noticed and
> fixed long ago?
Because it's incredibly non-obvious. Some of the folk doing big number
crunches noticed that the size of your environment changed the performance.
They have submitted both a fix to crt0.o and a mod to gcc to make it
align doubles on 8 byte boundaries. I'm hopeful that will make gcc 2.8.
------------------------------
From: Scott_N._Lutz@cd.geodyn.com
Date: 13 Feb 1997 18:17:03 GMT
Subject: Resyncs
I have written a device driver in Linux 1.2.8 that handles interrupts from a
bit/frame sync card and writes the data to a file on the hard disk. The
system has been running for 8 months on a deployed system which uses the
adaptec fast&wide pci/scsi controller (aic7880?) and a Quantum HD. We have
recently bought another system with a similar hardware configuration except
it is a Seagate Barracuda drive. Since I have lost my 1.2.8 CD I loaded
Slackware 3.1 with the 2.0.0 kernel. This required me to slightly modify the
device driver code (request_irq, free_irq, and the interrupt now provide
support for shared interrupts?) which was trivial. I have since loaded the
2.0.28 kernel, but no other patches.
The problem is that when I run the capture software I get continuous resyncs,
which in the past has been because the process couldn't write data to the
disk fast enough before the FIFO buffer on the card filled up. I don't
believe that the hardware is too slow so it must be a problem with the
operating system or my code. I'm willing to provide more specific
information if you'll let me know what is needed.
Scott N. Lutz
S_Lutz@cd.Geodyn.com
--
**************************************************************************
* Sent via FirstClass(R) UUCP gateway from Logicon Geodynamics, Inc (CD) *
**************************************************************************
------------------------------
From: Andras Kiraly <akiraly@iiic.ethz.ch>
Date: Thu, 13 Feb 1997 19:39:34 +0100 (MET)
Subject: Sony CDU33a+kernel>2.1.22=system hangs totally
Dear Linux Developers
I've been experiencing the following problem since kernel 2.1.23:
Every time I mount a CD-ROM, the system hangs totally; no keyboard
input, screen switching or CTRL-ALT-DEL is possible anymore. The
only thing I can do is reset the computer with the reset key.
If I step back to a kernel <2.1.23, the problem goes away. With
kernel 2.1.22, CD-ROM operation is no problem. The CD-ROM also works
without problems under DOS 6.20 and Win311.
My System configuration:
Sony CDU33a with the Sony COR-334 ISA-Adapter,
base address 340, no IRQ, DMA channel 0
ASUS P/I-p55t2p4 Motherboard, Cyrix P150+ CPU, 64MB RAM
Please let me know what further information you need or, if the problem
is already solved, what the solution is.
Best regards,
Kiraly Andras
------------------------------
From: "Charles W. Doolittle, N1SPX" <n1spx@snet.net>
Date: Thu, 13 Feb 1997 13:44:02 -0500 (EST)
Subject: insmod
Using insmod v2.0.0 (by insmod -v), with kernel 2.1.25, and AFAIK no
version support.
When I run insmod, it segfaults, and a kernel trace appears on the console
regarding "The Kernel Could Not Handle an Invalid Paging Request..."
I hope it's simple.
Chuck
N1SPX
------------------------------
From: Ingo Molnar <mingo@pc5829.hil.siemens.at>
Date: Thu, 13 Feb 1997 20:07:10 +0100 (MET)
Subject: Re: CONUNDRUM.
On Thu, 13 Feb 1997, Alan Cox wrote:
> Because its incredibly non obvious. Some of the folk doing big number
> crunches noticed that the size of your environment changed the
> performance. [...]
hm, maybe time to add 'misalignment fault' counting support to the kernel?
Has anyone tried this? Does it work for the FPU too? Is it reliable? ;)
i remember Dave has made this for the Sparc, and there is something like
this for x86 too.
- -- mingo
------------------------------
From: "Richard B. Johnson" <root@analogic.com>
Date: Thu, 13 Feb 1997 14:20:32 -0500 (EST)
Subject: Re: Performance patch for NE Ethernet
On Fri, 14 Feb 1997, Paul Gortmaker wrote:
>
> Not to be too harsh on anyone that is trying to help out, but the
> above claim is a bit dubious...
>
> > /* This check _should_not_ be necessary, omit eventually. */
> > - while ((inb_p(NE_BASE+EN0_ISR) & ENISR_RESET) == 0)
> > + while ((inb_p((NE_BASE+EN0_ISR)) & ENISR_RESET) == 0)
>
> Umm, adding extra sets of "( ... )" doesn't gain anything other than
> making it more work for me to read through the patch to see what you
> are trying to change. There are lots of these in this patch...
NE_BASE is a MACRO (#define), EN0_ISR is a MACRO (#define). inb_p() is
a MACRO that defines a compiler-specific "inline function".
We are _required_ to use parentheses around concatenated macros and
parameters passed to macros to guarantee that the code you specify
is what you get. Whether or not one gets away with sloppy coding using
gcc is not pertinent.
If you bothered to apply the patch to a copy of ne.c, you would see
that I got rid of most of the trash.
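As an aside on the precedence hazard under discussion, here is a minimal
sketch. The macros below are hypothetical and are not the definitions from
ne.c or the kernel headers; they only show what goes wrong when a macro body
or a macro parameter is left unparenthesized.

```c
#include <assert.h>

/* Hypothetical macros for illustration; NOT the real ne.c definitions. */
#define REG_UNSAFE        0x300 + 0x07      /* body not parenthesized */
#define REG_SAFE          (0x300 + 0x07)    /* body parenthesized */
#define TWICE_UNSAFE(x)   x * 2             /* parameter not parenthesized */
#define TWICE_SAFE(x)     ((x) * 2)         /* parameter and body parenthesized */

/* Expands to 0x300 + 0x07 * 2, so only 0x07 is doubled. */
int scaled_unsafe(void) { return REG_UNSAFE * 2; }

/* Expands to (0x300 + 0x07) * 2, the intended value. */
int scaled_safe(void) { return REG_SAFE * 2; }

/* Expands to 1 + 2 * 2. */
int twice_unsafe(void) { return TWICE_UNSAFE(1 + 2); }

/* Expands to ((1 + 2) * 2). */
int twice_safe(void) { return TWICE_SAFE(1 + 2); }

/* Call-site parentheses rescue the unsafe function-like macro: (1 + 2) * 2. */
int twice_unsafe_wrapped(void) { return TWICE_UNSAFE((1 + 2)); }

/* Self-check that can be called from any main(). */
void check_macro_results(void)
{
    assert(scaled_unsafe() == 0x30e);   /* 0x300 + 14 */
    assert(scaled_safe() == 0x60e);     /* 0x307 * 2 */
    assert(twice_unsafe() == 5);
    assert(twice_safe() == 6);
    assert(twice_unsafe_wrapped() == 6);
}
```

Note that the protection comes from the parentheses inside the macro
definitions. Extra parentheses at the call site help only when the macro being
called fails to parenthesize its own parameter; they cannot repair an
unparenthesized object-like macro body.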
>
> > - outb_p(ENISR_RESET, NE_BASE + EN0_ISR); /* Ack intr. */
> > + outb_p(ENISR_RESET, (NE_BASE + EN0_ISR)); /* Ack intr. */
>
> See above...
Ditto;
>
> > - the start of a page, so we optimize accordingly. */
> > + the start of a page, so we optimize accordingly.i
>
> Ummm, vi typo constitutes a patch? (Okay, we all do that now and again...)
It is inside a comment so wasn't detected.
>
> > - /* This *shouldn't* happen. If it does, it's the last thing you'll see */
> > - if (ei_status.dmaing) {
> > + /* This *shouldn't* happen. If it does, we'll fix it. */
> > + if (ei_status.dmaing)
> > + {
>
> No, that is written as such because it most likely *will* be the last
> thing you see, due to the way the hardware is. If the printk gets to the
> screen, then you have done well. Also, putting a { on a newline doesn't
> do anything more than make the patch bigger...
Not true. Absolutely not true.
>
> > + ne_reset_8390(dev);
>
> Well, a post "Oops, shouldn't have done that - reset the card." looks
> like a good idea, I doubt it will buy you a recovery.
>
> > - outb_p(E8390_NODMA+E8390_PAGE0+E8390_START, nic_base+ NE_CMD);
> > + outb_p(E8390_NODMA+E8390_PAGE0+E8390_START, (NE_BASE + NE_CMD));
>
> brackets... NE_BASE macro vs. nic_base will be obscured by GCC.
nic_base was an extra operation. Read the code.
On line 622 there is even "int nic_base = NE_BASE"
>
> > - outb_p(sizeof(struct e8390_pkt_hdr), nic_base + EN0_RCNTLO);
> > + outb_p(sizeof(struct e8390_pkt_hdr), (NE_BASE + EN0_RCNTLO));
>
> brackets...
>
> > - outb_p(0, nic_base + EN0_RCNTHI);
> > - outb_p(0, nic_base + EN0_RSARLO); /* On page boundary */
> > + outw(0, (NE_BASE + EN0_RSARLO)); /* On page boundary */
>
> Bzzt. Can't do that. You have ne1000 cards to deal with, as well as
> the fact that access to the 8390 requires the outb_p() -- see the
> datasheets if you are in doubt.
Oh yes you can. This is not a '286. The 8390 has a chip-to-chip select-time
specification (300ns) that does not apply to the data port or the page
registers. Configuration registers have that limitation because of internal
state-machine considerations. Film at 11.
>
> > - outb_p(ring_page, nic_base + EN0_RSARHI);
> > + outb_p(ring_page, (NE_BASE + EN0_RSARHI));
>
> More brackets...
>
> > - outb_p(E8390_RREAD+E8390_START, nic_base + NE_CMD);
> > + outb_p(E8390_RREAD+E8390_START, (NE_BASE + NE_CMD));
>
> More brackets...
>
> > - insw(NE_BASE + NE_DATAPORT, hdr, sizeof(struct e8390_pkt_hdr)>>1);
> > + insw((NE_BASE + NE_DATAPORT), hdr, sizeof(struct e8390_pkt_hdr)>>1);
>
> More brackets...
>
> > - insb(NE_BASE + NE_DATAPORT, hdr, sizeof(struct e8390_pkt_hdr));
> > + insb((NE_BASE + NE_DATAPORT), hdr, sizeof(struct e8390_pkt_hdr));
>
> More brackets...
>
> > - outb_p(ENISR_RDC, nic_base + EN0_ISR); /* Ack intr. */
> > + outb_p(ENISR_RDC, (NE_BASE + EN0_ISR)); /* Ack intr. */
>
> More brackets...
>
> From here on, you have deleted a bunch of code that is ifdef'd out
> anyways. That doesn't buy you anything in speed or size.
Yes I did.
>
> > - outb_p(count & 0xff, nic_base + EN0_RCNTLO);
> > - outb_p(count >> 8, nic_base + EN0_RCNTHI);
> > + outw(count, (NE_BASE + EN0_RCNTLO));
>
> Nope, sorry, same as above. ne1000 cards, and you need the i/o pause.
> Hitting an 8390 with back-to-back I/O is a good way to fill my mailbox.
>
The film at 11.
Not true on registers that set addresses or the data port registers. It
is not the port reads/writes themselves that cause the problems, but
commands that force internal configuration changes. For instance, if
you even READ the status register while a "remote DMA" is in progress,
the chip's bus interface state-machine will hang your bus forever.
But note that someone else already fixed this problem by using a "reentry"
flag "ei_status.dmaing". After this addition, many of the earlier
attempts to fix this problem were no longer necessary. I have extensively
used the LONGSHINE LCS-8634L (12 machines, 9 of them only 386s) using the code
changes. This really cheap clone board is one of the worst offenders for
bus-hangs.
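To make the disputed change concrete: the patch replaces two byte writes to an
adjacent low/high register pair with one 16-bit write. The sketch below models
only the data placement, using a hypothetical in-memory array in place of real
I/O ports (model_outb, model_outw, REG_LO and REG_HI are names invented for
this example, and the offsets are placeholders). It says nothing about the
I/O-pause timing or the 8-bit ne1000 cards raised in the quoted objections,
which is where the actual disagreement lies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an I/O port space; real port I/O needs
 * kernel privileges and real hardware. */
static uint8_t fake_ports[0x20];

/* Hypothetical adjacent low/high register pair. */
enum { REG_LO = 0x0a, REG_HI = 0x0b };

/* Model of an 8-bit port write. */
void model_outb(uint8_t value, unsigned port)
{
    fake_ports[port] = value;
}

/* Model of a 16-bit little-endian port write:
 * low byte lands at 'port', high byte at 'port + 1'. */
void model_outw(uint16_t value, unsigned port)
{
    fake_ports[port]     = (uint8_t)(value & 0xff);
    fake_ports[port + 1] = (uint8_t)(value >> 8);
}

/* Original style: two separate byte writes. */
void set_count_bytes(uint16_t count)
{
    model_outb((uint8_t)(count & 0xff), REG_LO);
    model_outb((uint8_t)(count >> 8), REG_HI);
}

/* Patched style: one word write to the low register. */
void set_count_word(uint16_t count)
{
    model_outw(count, REG_LO);
}

/* Read the pair back as a 16-bit value. */
uint16_t read_count(void)
{
    return (uint16_t)(fake_ports[REG_LO] | (fake_ports[REG_HI] << 8));
}
```

In this model both styles leave the same bytes in the same two registers, so
any difference on real hardware would have to come from bus width or timing
rather than from the values written.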
> > - outb_p(ring_offset & 0xff, nic_base + EN0_RSARLO);
> > - outb_p(ring_offset >> 8, nic_base + EN0_RSARHI);
> > + outw(ring_offset, (NE_BASE + EN0_RSARLO));
>
> Same as above.
> get you anything in terms of size or performance. When the next
> poor sod has to come along and understand the driver, it may help
> him/her out a lot to be able to read that code, even if it is inactive.
>
> Paul.
> (a.k.a. current unfortunate sod who deals with ne.c)
It gets rid of 4 years worth of unused junk. The _entire_ input routine
becomes:
static void
ne_block_input(struct device *dev, int count, struct sk_buff *skb, int ring_offset)
{
    if(ei_status.dmaing)
        return;
    ei_status.dmaing |= 0x01;
    ++count;
    count &= ~1;
    outb_p(E8390_NODMA+E8390_PAGE0+E8390_START, (NE_BASE + NE_CMD));
    outw(count, (NE_BASE + EN0_RCNTLO));
    outw(ring_offset, (NE_BASE + EN0_RSARLO));
    outb_p(E8390_RREAD+E8390_START, (NE_BASE + NE_CMD));
    if(ei_status.word16)
        insw((NE_BASE + NE_DATAPORT), BUF, count>>1);
    else
        insb((NE_BASE + NE_DATAPORT), BUF, count);
    outb_p(ENISR_RDC, (NE_BASE + EN0_ISR)); /* Ack intr. */
    ei_status.dmaing &= ~0x01;
    return;
}
The output routine is also shortened AND made clearer. You can't just
look at a patch to determine its validity.
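Two idioms in the routine above can be tried in isolation: rounding the byte
count up to an even number ("++count; count &= ~1;") and the "dmaing" flag
used as a re-entry guard. The following is a user-space sketch only;
struct demo_status and guarded_transfer are hypothetical names, no hardware is
touched, and the flag here is not protected against interrupts or concurrent
callers.

```c
#include <assert.h>

/* Hypothetical stand-in for the driver's per-device status. */
struct demo_status {
    unsigned char dmaing;   /* non-zero while a transfer is in progress */
};

/* Round a byte count up to the next even number,
 * equivalent to "++count; count &= ~1;". */
int round_up_even(int count)
{
    return (count + 1) & ~1;
}

/* Returns -1 if a transfer is already in progress; otherwise marks the
 * device busy, "transfers" the rounded count, clears the flag and returns
 * the number of bytes that would have been moved. */
int guarded_transfer(struct demo_status *st, int count)
{
    if (st->dmaing)
        return -1;
    st->dmaing |= 0x01;

    count = round_up_even(count);
    /* ... a real driver would program the chip and read the data here ... */

    st->dmaing &= (unsigned char)~0x01;
    return count;
}
```

The sketch returns -1 when the guard trips so the caller can tell; the routine
quoted above returns void, so in that form a blocked call simply comes back
without having copied anything.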
Cheers,
Dick Johnson
- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Richard B. Johnson
Project Engineer
Analogic Corporation
Voice : (508) 977-3000 ext. 3754
Fax : (508) 532-6097
Modem : (508) 977-6870
Ftp : ftp@boneserver.analogic.com
Email : rjohnson@analogic.com, johnson@analogic.com
Penguin : Linux version 2.1.26 on an i586 machine (66.15 BogoMips).
Warning : It's hard to remain at the trailing edge of technology.
- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
------------------------------
From: worley@ariadne.com (Dale R. Worley)
Date: Thu, 13 Feb 1997 14:28:32 -0500
Subject: Report on compiling 2.1.26
I've just made a reasonably clean compile of 2.1.26 with "everything"
configured in (no modules, though). This is the list of problems I
noticed. Some of the problems I can't fix (for instance, much of the
ISDN code hasn't been updated to take into account recent kernel
changes), but for the ones I could really fix, I've included patches
below.
1. arch/i386/kernel/time.c
Some code regarding lost_ticks and USECS_PER_JIFFY wasn't suppressed
when CONFIG_APM was set. Also, I marked some #endif's to tell what
the matching #ifdef's were testing on.
2. drivers/ap1000/{apfddi.c,apfddi.h,bif.c}
The structure enet_statistics has been replaced by net_device_stats.
(I think.)
3. drivers/isdn/{isdn_audio.c,isdn_common.c,isdn_net.c,isdn_ppp.c,isdn_tty.c,
pcbit/capi.c,pcbit/layer2.c}
The 'free', 'lock', and 'len' fields of struct sk_buff no longer exist.
4. drivers/isdn/{isdn_common.c,isdn_ppp.c}
select_wait seems to have been replaced by a 'poll' function, but the
details of how to replace the one with the other are unclear.
5. drivers/isdn/isdn_net.c
The 'header_cache_bind' field is no longer in struct device.
6. drivers/net/cs89x0.c
Missing final */ on a comment.
7. drivers/net/dlci.c
"FAULT" used instead of "EFAULT".
8. drivers/net/ni52.c
Extra newline damaging a #define.
9. drivers/net/sdla.c
Unknown function amemcpy() used.
10. drivers/sound/lowlevel/aci.c
Use of get_user not updated for new interface.
11. net/802/fddi.c
ETH_P_8022 used instead of ETH_P_802_2.
12. scripts/Configure
Cause defaulted values to be output when they are used.
Extraneous backslash prevents integer answers from being accepted.
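On item 10: the old call style was "x = get_user(ptr)", while the new one is
"get_user(x, ptr)", which stores into x and, as far as I know, returns 0 or
-EFAULT. The sketch below is a user-space mock of that calling convention, not
the kernel macro: mock_get_user and read_volume are hypothetical names, and
the mock treats only a NULL pointer as a bad address. It also checks the
return value, which a mechanical conversion of the old calls does not do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space mock of the two-argument calling convention.
 * NOT the kernel's get_user(); only NULL counts as a bad address here. */
#define mock_get_user(x, ptr) \
    ((ptr) == NULL ? -EFAULT : ((x) = *(ptr), 0))

/* Fetch a mixer argument from "user space", keep the low byte as the
 * volume and clamp it to 0..100. Returns the volume, or a negative
 * errno value if the fetch failed. */
int read_volume(const int *user_arg)
{
    int param;
    int err = mock_get_user(param, user_arg);

    if (err)
        return err;

    param &= 0xff;
    return param > 100 ? 100 : param;
}
```

A caller would then propagate a negative result instead of using an
uninitialized value, for example "if ((vol = read_volume(arg)) < 0) return vol;".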
- ----------------------------------------------------------------------
diff -u arch/i386/kernel/time.c.orig arch/i386/kernel/time.c
- --- arch/i386/kernel/time.c.orig Wed Feb 12 09:17:44 1997
+++ arch/i386/kernel/time.c Wed Feb 12 09:19:44 1997
@@ -128,7 +128,7 @@
return edx;
}
- -#endif
+#endif /* !CONFIG_APM */
/* This function must be called with interrupts disabled
* It was inspired by Steve McCanne's microtime-i386 for BSD. -- jrs
@@ -265,12 +265,14 @@
*tv = xtime;
tv->tv_usec += do_gettimeoffset();
+#ifndef CONFIG_APM
/*
* xtime is atomically updated in timer_bh. lost_ticks is
* nonzero if the timer bottom half hasnt executed yet.
*/
if (lost_ticks)
tv->tv_usec += USECS_PER_JIFFY;
+#endif /* !CONFIG_APM */
restore_flags(flags);
@@ -422,7 +424,7 @@
"=d" (last_timer_cc.high));
timer_interrupt(irq, NULL, regs);
}
- -#endif
+#endif /* !CONFIG_APM */
/* Converts Gregorian date to seconds since 1970-01-01 00:00:00.
* Assumes input in normal date format, i.e. 1980-12-31 23:59:59
@@ -530,6 +532,6 @@
"=d" (init_timer_cc.high));
irq0.handler = pentium_timer_interrupt;
}
- -#endif
+#endif /* !CONFIG_APM */
setup_x86_irq(0, &irq0);
}
diff -u drivers/ap1000/apfddi.c.orig drivers/ap1000/apfddi.c
- --- drivers/ap1000/apfddi.c.orig Mon Feb 10 09:53:17 1997
+++ drivers/ap1000/apfddi.c Mon Feb 10 10:44:13 1997
@@ -126,7 +126,7 @@
static u_char apfddi_saddr[6] = { 0x42, 0x9a, 0x08, 0x6e, 0x11, 0x41 };
struct device *apfddi_device = NULL;
- -struct enet_statistics *apfddi_stats = NULL;
+struct net_device_stats *apfddi_stats = NULL;
volatile struct apfddi_queue *apfddi_queue_top = NULL;
@@ -254,7 +254,7 @@
static void apfddi_interrupt(int irq, void *dev_id, struct pt_regs *regs);
static int apfddi_xmit(struct sk_buff *skb, struct device *dev);
int apfddi_rx(struct mac_buf *mbuf);
- -static struct enet_statistics *apfddi_get_stats(struct device *dev);
+static struct net_device_stats *apfddi_get_stats(struct device *dev);
#if APFDDI_DEBUG
void dump_packet(char *action, char *buf, int len, int seq);
#endif
@@ -496,11 +496,11 @@
dev->stop = apfddi_stop;
dev->hard_start_xmit = apfddi_xmit;
dev->get_stats = apfddi_get_stats;
- - dev->priv = kmalloc(sizeof(struct enet_statistics), GFP_ATOMIC);
+ dev->priv = kmalloc(sizeof(struct net_device_stats), GFP_ATOMIC);
if (dev->priv == NULL)
return -ENOMEM;
- - memset(dev->priv, 0, sizeof(struct enet_statistics));
- - apfddi_stats = (struct enet_statistics *)apfddi_device->priv;
+ memset(dev->priv, 0, sizeof(struct net_device_stats));
+ apfddi_stats = (struct net_device_stats *)apfddi_device->priv;
/* Initialise the fddi device structure */
for (i = 0; i < DEV_NUMBUFFS; i++)
@@ -692,9 +692,9 @@
/*
* Return statistics of fddi driver.
*/
- -static struct enet_statistics *apfddi_get_stats(struct device *dev)
+static struct net_device_stats *apfddi_get_stats(struct device *dev)
{
- - return((struct enet_statistics *)dev->priv);
+ return((struct net_device_stats *)dev->priv);
}
diff -u drivers/ap1000/apfddi.h.orig drivers/ap1000/apfddi.h
- --- drivers/ap1000/apfddi.h.orig Mon Feb 10 09:53:17 1997
+++ drivers/ap1000/apfddi.h Mon Feb 10 10:44:13 1997
@@ -138,5 +138,5 @@
void set_cf_join(int on);
extern struct device *apfddi_device;
- -extern struct enet_statistics *apfddi_stats;
+extern struct net_device_stats *apfddi_stats;
diff -u drivers/ap1000/bif.c.orig drivers/ap1000/bif.c
- --- drivers/ap1000/bif.c.orig Mon Feb 10 09:53:18 1997
+++ drivers/ap1000/bif.c Mon Feb 10 10:44:13 1997
@@ -46,14 +46,14 @@
#define BIF_MTU 10240
static struct device *bif_device = 0;
- -static struct enet_statistics *bif_stats = 0;
+static struct net_device_stats *bif_stats = 0;
int bif_init(struct device *dev);
int bif_open(struct device *dev);
static int bif_xmit(struct sk_buff *skb, struct device *dev);
int bif_rx(struct sk_buff *skb);
int bif_stop(struct device *dev);
- -static struct enet_statistics *bif_get_stats(struct device *dev);
+static struct net_device_stats *bif_get_stats(struct device *dev);
static int bif_hard_header(struct sk_buff *skb, struct device *dev,
unsigned short type, void *daddr,
@@ -128,11 +128,11 @@
dev->open = bif_open;
dev->flags = IFF_NOARP; /* Don't use ARP on this device */
dev->family = AF_INET;
- - dev->priv = kmalloc(sizeof(struct enet_statistics), GFP_KERNEL);
+ dev->priv = kmalloc(sizeof(struct net_device_stats), GFP_KERNEL);
if (dev->priv == NULL)
return -ENOMEM;
- - memset(dev->priv, 0, sizeof(struct enet_statistics));
- - bif_stats = (struct enet_statistics *)bif_device->priv;
+ memset(dev->priv, 0, sizeof(struct net_device_stats));
+ bif_stats = (struct net_device_stats *)bif_device->priv;
dev->stop = bif_stop;
@@ -282,8 +282,8 @@
/*
* Return statistics of bif driver.
*/
- -static struct enet_statistics *bif_get_stats(struct device *dev)
+static struct net_device_stats *bif_get_stats(struct device *dev)
{
- - return((struct enet_statistics *)dev->priv);
+ return((struct net_device_stats *)dev->priv);
}
diff -u drivers/net/cs89x0.c.orig drivers/net/cs89x0.c
- --- drivers/net/cs89x0.c.orig Mon Feb 10 10:31:22 1997
+++ drivers/net/cs89x0.c Mon Feb 10 19:25:54 1997
@@ -1179,4 +1179,4 @@
* c-indent-level: 8
* tab-width: 8
* End:
- - *
+ */
diff -u drivers/net/dlci.c.orig drivers/net/dlci.c
- --- drivers/net/dlci.c.orig Mon Feb 10 18:20:48 1997
+++ drivers/net/dlci.c Mon Feb 10 18:20:54 1997
@@ -296,7 +296,7 @@
if (!get)
{
if(copy_from_user(&config, conf, sizeof(struct dlci_conf)))
- - return -FAULT;
+ return -EFAULT;
if (config.flags & ~DLCI_VALID_FLAGS)
return(-EINVAL);
memcpy(&dlp->config, &config, sizeof(struct dlci_conf));
diff -u drivers/net/ni52.c.orig drivers/net/ni52.c
- --- drivers/net/ni52.c.orig Mon Feb 10 17:24:09 1997
+++ drivers/net/ni52.c Mon Feb 10 17:24:12 1997
@@ -164,8 +164,7 @@
#define DELAY_18(); { __delay( (loops_per_sec>>18)+1 ); }
/* wait for command with timeout: */
- -#define WAIT_4_SCB_CMD()
- -{ int i; \
+#define WAIT_4_SCB_CMD() { int i; \
for(i=0;i<16384;i++) { \
if(!p->scb->cmd_cuc) break; \
DELAY_18(); \
diff -u drivers/sound/lowlevel/aci.c.orig drivers/sound/lowlevel/aci.c
- --- drivers/sound/lowlevel/aci.c.orig Fri Oct 25 06:06:35 1996
+++ drivers/sound/lowlevel/aci.c Mon Feb 10 20:29:35 1997
@@ -292,7 +292,7 @@
int vol, ret;
unsigned param;
- - param = get_user((int *) arg);
+ get_user(param, (int *) arg);
/* left channel */
vol = param & 0xff;
if (vol > 100) vol = 100;
@@ -318,8 +318,12 @@
/* handle solo mode control */
if (cmd == SOUND_MIXER_PRIVATE1) {
- - if (get_user((int *) arg) >= 0) {
- - aci_solo = !!get_user((int *) arg);
+ int temp;
+
+ get_user(temp, (int *) arg);
+ if (temp >= 0) {
+ get_user(temp, (int *) arg);
+ aci_solo = !!temp;
if (write_cmd(0xd2, aci_solo)) return -EIO;
} else if (aci_version >= 0xb0) {
if ((status = read_general_status()) < 0) return -EIO;
@@ -332,6 +336,8 @@
if (cmd & IOC_IN)
/* read and write */
switch (cmd & 0xff) {
+ int temp;
+
case SOUND_MIXER_VOLUME:
return setvolume(arg, 0x01, 0x00);
case SOUND_MIXER_CD:
@@ -349,7 +355,8 @@
case SOUND_MIXER_LINE2: /* AUX2 */
return setvolume(arg, 0x3e, 0x36);
case SOUND_MIXER_IGAIN: /* MIC pre-amp */
- - vol = get_user((int *) arg) & 0xff;
+ get_user(temp, (int *) arg);
+ vol = temp & 0xff;
if (vol > 100) vol = 100;
vol = SCALE(100, 3, vol);
if (write_cmd(0x03, vol)) return -EIO;
diff -u net/802/fddi.c.orig net/802/fddi.c
- --- net/802/fddi.c.orig Tue Feb 11 09:46:35 1997
+++ net/802/fddi.c Tue Feb 11 09:46:38 1997
@@ -131,7 +131,7 @@
if(fddi->hdr.llc_8022_1.dsap==0xe0)
{
skb_pull(skb, FDDI_K_8022_HLEN-3);
- - type=htons(ETH_P_8022);
+ type=htons(ETH_P_802_2);
}
else
{
diff -u net/802/p8022.c.orig net/802/p8022.c
- --- net/802/p8022.c.orig Mon Feb 10 10:35:16 1997
+++ net/802/p8022.c Tue Feb 11 09:48:12 1997
@@ -80,7 +80,7 @@
static struct packet_type p8022_packet_type =
{
- - 0, /* MUTTER ntohs(ETH_P_8022),*/
+ 0, /* MUTTER ntohs(ETH_P_802_2),*/
NULL, /* All devices */
p8022_rcv,
NULL,
diff -u scripts/Configure.orig scripts/Configure
- --- scripts/Configure.orig Mon Feb 10 10:35:20 1997
+++ scripts/Configure Tue Feb 11 08:50:22 1997
@@ -108,7 +108,7 @@
#
function readln () {
if [ "$DEFAULT" = "-d" -a -n "$3" ]; then
- - echo "$1"
+ echo "$1$2 [defaulted]"
ans=$2
else
echo -n "$1"
@@ -288,7 +288,7 @@
def=${old:-$3}
while :; do
readln "$1 ($2) [$def] " "$def" "$old"
- - if expr "$ans" : '0$\|-\?[1-9][0-9]*$' > /dev/null; then
+ if expr "$ans" : '0$\|-?[1-9][0-9]*$' > /dev/null; then
define_int "$2" "$ans"
break
else
- ----------------------------------------------------------------------
Dale
- --
Dale R. Worley Ariadne Internet Services
Voice: +1 617-899-7949 Fax: +1 617-899-7946 E-mail: worley@ariadne.com
"Internet-based electronic commerce solutions to real business problems."
------------------------------
End of linux-kernel-digest V1 #746
**********************************
To subscribe to linux-kernel-digest, send the command:
subscribe linux-kernel-digest
in the body of a message to "Majordomo@Majordomo.vger.rutgers.edu". If you want
to subscribe something other than the account the mail is coming from,
such as a local redistribution list, then append that address to the
"subscribe" command; for example, to subscribe "local-linux-kernel":
subscribe linux-kernel-digest local-linux-kernel@your.domain.net
A non-digest (direct mail) version of this list is also available; to
subscribe to that instead, replace all instances of "linux-kernel-digest"
in the commands above with "linux-kernel".