Bad TTY Magic Number -> Memory Initialization Problem?

Terence Murphy (tsmurphy@ichips.intel.com)
Fri, 30 Oct 1998 09:34:09 -0800 (PST)


Hi Folks,

I've been working for a while on getting Linux (2.1.126) to boot on a Quad
Xeon with no BIOS and no devices besides a timer, PIC, and serial port. It
boots fairly far into the process and actually runs programs, but I see
intermittent problems, which I believe may be due to memory corruption or
missing initialization during the boot process.

For the record, I am running GCC 2.7.2.3 and BINUTILS 2.9.1.

Let me first describe a symptom of this. When booting I get the following
kernel message:

Warning: dev (17:d0) tty->count(2116024452) != #fd's(1) in tty_open
Warning: bad magic number for tty struct (04:41) in release_dev
Warning: unable to open an initial console.

This is intermittent, though. I got all three lines several times, but now
I only get the final line (with no further description). I actually got the
console to work exactly one time, but then the kernel immediately
panicked and crashed.
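
One way to get a first hint from the bogus tty->count value is to look at it
in hex and byte by byte. The following is a minimal sketch, not kernel code:
the helper names split_bytes(), looks_like_ascii() and report() are
hypothetical, and the little-endian byte order is an assumption that holds
on x86.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: split a 32-bit value into its four bytes,
 * least significant first (the order they sit in memory on x86). */
void split_bytes(uint32_t value, uint8_t out[4])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}

/* Hypothetical helper: returns 1 if every byte is printable ASCII, which
 * would hint that text was written over the field, and 0 otherwise. */
int looks_like_ascii(uint32_t value)
{
    uint8_t b[4];

    split_bytes(value, b);
    for (int i = 0; i < 4; i++)
        if (b[i] < 0x20 || b[i] > 0x7e)
            return 0;
    return 1;
}

/* Print a one-line report for a suspicious counter value. */
void report(uint32_t value)
{
    printf("value = %lu = 0x%08lx, ascii-like: %s\n",
           (unsigned long)value, (unsigned long)value,
           looks_like_ascii(value) ? "yes" : "no");
}
```

For the count above, 2116024452 is 0x7E1FF884. That is neither a small
reference count nor printable text, which is at least consistent with the
field never having been initialized or having been overwritten by unrelated
data.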

I have experienced other symptoms which I believe are related:

For a while, the function HANDLE_IRQ_EVENT() was hanging in the
IRQ_ENTER() macro the first time it was called. I solved this by
explicitly initializing the variable GLOBAL_IRQ_LOCK to 0 (in
arch/i386/kernel/setup.c). This is a static variable, so it should
already have been initialized to 0. That is why I suspect an
initialization (or corruption) problem.

After this, I was experiencing a problem where KMEM_CREATE would
think it was being called during an interrupt. I resolved this,
again, by initializing the array LOCAL_IRQ_COUNT to all 0's.
So again I suspect an initialization/corruption problem.
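
Both workarounds point at statics that were expected to be zero but were
not. The sketch below assumes such variables live in a zero-filled data
area that startup code must clear before any C code runs. It shows the
clearing step and an early sanity check in plain C. The function names
clear_range() and first_nonzero() are hypothetical; in a real kernel the
range bounds would come from linker symbols, not from a test buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero the half-open byte range [start, end), as startup code is expected
 * to do for uninitialized statics before any C code relies on them. */
void clear_range(unsigned char *start, unsigned char *end)
{
    assert(start <= end);
    memset(start, 0, (size_t)(end - start));
}

/* Return a pointer to the first nonzero byte in [start, end), or NULL if
 * the whole range is zero. Intended as an early sanity check. */
unsigned char *first_nonzero(unsigned char *start, unsigned char *end)
{
    for (unsigned char *p = start; p < end; p++)
        if (*p != 0)
            return p;
    return NULL;
}
```

Calling a check like first_nonzero() right after the clearing step, and
again just before the first use of the affected variables, would help
distinguish "never cleared" from "cleared and later overwritten".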

Let me describe my setup / boot process:

I am using a mostly stock 2.1.126 kernel. The main changes I have
made are:

(A) I don't run arch/i386/boot/bootsect.S (I jump directly
to the setup)
(B) In arch/i386/boot/setup.S, I commented out all of the BIOS calls
to get memory size, drive configuration, APM, MCA, etc. I fill
in the memory size manually for now.
(C) In arch/i386/kernel/head.S, I initialize the local APIC (since
there's no BIOS) and initialize all memory to 0 (except for the
kernel image, setup program, and RAM disk)
(D) The initialization described above (which I put into
arch/i386/kernel/setup.c)
(E) I am loading the kernel image at 0x100000, the setup at 0x90000,
the INITRD ram disk at 0x2000000. (setup = first 2k of bzImage,
image = rest of bzImage, INITRD = ram disk created with mke2fs).
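
Two checks on the layout in (C) and (E) can be sketched in C. The first
derives the setup/kernel split from the image itself. It relies on my
understanding of the documented i386 boot protocol: the setup_sects byte
sits at offset 0x1F1 of the boot sector, a value of 0 means 4, and the
protected-mode kernel starts after one boot sector plus setup_sects setup
sectors. The second check tests whether two memory ranges overlap, which
applies both to the load addresses and to the ranges skipped while zeroing
memory. The names kernel_file_offset(), struct region and regions_overlap()
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE         512u
#define SETUP_SECTS_OFF     0x1f1u  /* setup_sects byte in the boot sector */
#define DEFAULT_SETUP_SECTS 4u      /* value to assume when the field is 0 */

/* Offset in the image file where the protected-mode kernel starts:
 * one boot sector plus setup_sects setup sectors. Returns 0 if the
 * buffer is too short to contain the header byte. */
size_t kernel_file_offset(const unsigned char *image, size_t len)
{
    unsigned sects;

    assert(image != NULL);
    if (len <= SETUP_SECTS_OFF)
        return 0;
    sects = image[SETUP_SECTS_OFF];
    if (sects == 0)
        sects = DEFAULT_SETUP_SECTS;
    return (size_t)(sects + 1u) * SECTOR_SIZE;
}

/* A physical memory range [start, start + size). */
struct region {
    uint32_t start;
    uint32_t size;
};

/* Returns 1 if the two ranges share at least one byte, 0 otherwise. */
int regions_overlap(struct region a, struct region b)
{
    uint64_t a_end = (uint64_t)a.start + a.size;
    uint64_t b_end = (uint64_t)b.start + b.size;

    return a.size != 0 && b.size != 0 &&
           a.start < b_end && b.start < a_end;
}
```

If the value computed by kernel_file_offset() for this bzImage is larger
than 2k, the fixed "first 2k" split in (E) would be worth double-checking.
The sizes of the three regions are not stated above, so any sizes passed to
regions_overlap() have to be filled in from the actual build.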

Also, I have heard about some corruption problems with SMP. I'm
not sure whether that is a factor here. Linux thinks the Quad Xeon
is a single-CPU Celeron. ;-) I would like to try building a kernel
without SMP defined (I'm not sure how to do this) to be sure.

I'm looking for suggestions on how to attack this problem. It may be
a BIOS issue (some BIOS initialization which Linux depends on, which
I haven't done), but I can't think of any specific candidates.

It seems highly likely that it is a memory corruption issue caused by
the kernel. How would I go about tracking this down?
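
One generic approach, sketched here in plain C and not an existing kernel
facility, is to checksum a range that should stay constant and re-check it
at successive points during boot, so that the window in which the
corruption happens can be narrowed down. The names struct watch,
range_sum(), watch_arm() and watch_changed() are hypothetical, and the
checksum is a simple rotate-and-add chosen only for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical watch record for one memory range that should not change. */
struct watch {
    const volatile unsigned char *base;
    size_t len;
    uint32_t sum;
};

/* Rotate-and-add checksum over the range; simple, not cryptographic. */
uint32_t range_sum(const volatile unsigned char *base, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum = ((sum << 1) | (sum >> 31)) + base[i];
    return sum;
}

/* Record the current checksum of [base, base + len). */
void watch_arm(struct watch *w, const volatile void *base, size_t len)
{
    assert(w != NULL && base != NULL);
    w->base = base;
    w->len = len;
    w->sum = range_sum(w->base, w->len);
}

/* Returns 1 if the checksum no longer matches the recorded one. */
int watch_changed(const struct watch *w)
{
    assert(w != NULL);
    return range_sum(w->base, w->len) != w->sum;
}
```

Arming a watch over a structure right after it is set up and calling
watch_changed() at later checkpoints (for example, before the console is
opened) shows between which two checkpoints the contents changed. It only
works for data that is not supposed to be modified in that interval.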

I'd also like to try an older kernel. However, I depend on both
INITRD and a serial console, which I believe were both introduced
recently. Is there a more stable kernel I can try which has
both of those?

Thanks in advance for any help / suggestions / direction.

Regards,

Terry Murphy (speaking only for myself)
