2.5.19 (and earlier) IDE (+EXT3+???) bugs

From: David Brownell (david-b@pacbell.net)
Date: Fri May 31 2002 - 15:28:31 EST


I'm trying to use a slightly elderly laptop for some testing
on the 2.5 kernels. (As an ultralight, it's got some hardware
that exercises some interesting USB/APM codepaths that don't
otherwise show up.) It's run Linux (mostly) since I got it,
and I can install and use RH 7.3 on it, no troubles.

But it doesn't seem to want to run any recent 2.5 kernels.
I first tried with 2.5.15, and kernels up to and including
the latest (2.5.19 as I write) have the same overall failure
mode ... which does not happen with any of the 2.4 kernels
I've tried. (No recent ones other than the RH 7.3 code,
but many earlier ones.) Basically, I see:

- kernel loads OK ... I attach "dmesg" output.
- runs init, which runs init scripts.
- everything's fine, the disk fscks clean, UNTIL ...
- ...it blows up when remounting the root filesystem r/w
   * Takes a *long* time, if it even succeeds
   * Most of that time is evidently used to scribble over
     as much of the disk as it can!
   * If I power down the system very quickly, "fsck" can mostly
     recover. If not, then both root and /boot get trashed.
- Next step is to re-install the OS again.

As a stock RH 7.3 install, this root filesystem uses ext3.

I was able to boot with "init=/bin/sh" and do some basic
testing with a read-only root FS. Reading files works ok,
"hdparm -I" gives the same info it did under the RH7.3 kernel,
and I can use "dd" to read and write to the disk. (USB works OK;
I can bring it up by hand using the "ohci-hcd" driver, which
is how I could transfer the dmesg info off this system.)

So far the only really suggestive thing I've come up with is
that if I do much disk I/O, I start to see "hda: lost interrupt"
and the operation seems to become timeout-driven. I first
noticed that with "dd", but then "fsck" of the root FS (5+MBytes)
turned up the same failure. (The fsck took so much time I had
to kill it; running on 2.4, it quickly reported no problems.)
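
One way to make the "timeout-driven" behaviour visible is to time each
read individually and flag the slow ones. The following is only a
minimal sketch, assuming Python is available on the test system; the
function names, the 64 KiB block size and the 1-second stall threshold
are illustrative choices, not anything taken from the kernel or from
the tests described above. Reads served from the page cache will look
fast, so it is only meaningful on data that has not been read yet.

```python
import time


def timed_reads(path, block_size=64 * 1024, max_blocks=256):
    """Read up to max_blocks sequential blocks from path.

    Returns a list with the wall-clock latency (seconds) of each read.
    The file is opened read-only, so nothing is written to the disk.
    """
    latencies = []
    # buffering=0 yields a raw file object: each read() is one OS-level read.
    with open(path, "rb", buffering=0) as dev:
        for _ in range(max_blocks):
            start = time.monotonic()
            chunk = dev.read(block_size)
            elapsed = time.monotonic() - start
            if not chunk:  # end of file or device reached
                break
            latencies.append(elapsed)
    return latencies


def find_stalls(latencies, threshold=1.0):
    """Return (block_index, seconds) for every read at or above threshold."""
    return [(i, t) for i, t in enumerate(latencies) if t >= threshold]
```

Calling timed_reads("/dev/hda") as root and passing the result to
find_stalls() would list the blocks whose reads stalled; a cluster of
multi-second entries would be consistent with I/O that only completes
when a timeout fires.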

Does anyone know what might be going on here? Or better yet,
have a fix to whatever it is that's wrong? :) Seems to me there
is a clear IDE problem: lost interrupts were not an issue on
the 2.4 kernels. Whether fixing that would make that "scribble
on the disk" problem go away, I couldn't say.
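
To compare kernels on the lost-interrupt symptom, the messages can be
counted per drive from saved kernel log text. This is a rough sketch:
the only message text taken from the report is "hda: lost interrupt";
the regular expression, the function name and the assumption that
other drives would log in the same "hdX: ..." form are illustrative.

```python
import re
from collections import Counter

# Matches e.g. "hda: lost interrupt"; the hdX prefix for other drives is assumed.
LOST_IRQ = re.compile(r"\b(hd[a-z]): lost interrupt")


def count_lost_interrupts(log_text):
    """Return a dict mapping drive name to its number of lost-interrupt lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOST_IRQ.search(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)
```

Feeding it the saved "dmesg" output from a 2.4 boot and from a 2.5 boot
after the same I/O workload would give a simple per-kernel count to
quote in a bug report.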

- Dave

p.s. Hardware is a Toshiba Portege 3020ct, PCI host bridge
      is a "Toshiba America Info Systems 601 (rev a2)"
      according to lspci.

Linux version 2.5.19 (root@neon) (gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-110)) #1 Thu May 30 19:40:52 PDT 2002
Video mode to be used for restore is f03
BIOS-provided physical RAM map:
  BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
  BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
  BIOS-e820: 0000000000100000 - 0000000004010000 (usable)
  BIOS-e820: 0000000004010000 - 0000000004020000 (ACPI data)
  BIOS-e820: 0000000004020000 - 0000000004040000 (reserved)
  BIOS-e820: 00000000fef80000 - 00000000ff000000 (reserved)
  BIOS-e820: 00000000ffee0000 - 00000000ffee6e00 (reserved)
  BIOS-e820: 00000000ffee6e00 - 00000000ffee7000 (ACPI NVS)
  BIOS-e820: 00000000ffee7000 - 00000000ffef0000 (reserved)
  BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
64MB LOWMEM available.
On node 0 totalpages: 16400
zone(0): 4096 pages.
zone(1): 12304 pages.
zone(2): 0 pages.
Kernel command line: init=/bin/sh ro root=/dev/hda2 vga=0x0f03
Initializing CPU#0
Detected 299.947 MHz processor.
Console: colour VGA+ 80x28
Calibrating delay loop... 598.01 BogoMIPS
Memory: 62912k/65600k available (1004k kernel code, 2300k reserved, 244k data, 216k init, 0k highmem)
Dentry-cache hash table entries: 16384 (order: 5, 131072 bytes)
Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: Before vendor init, caps: 008001bf 00000000 00000000, vendor = 0
Intel Pentium with F0 0F bug - workaround enabled.
CPU: After vendor init, caps: 008001bf 00000000 00000000 00000000
CPU: After generic, caps: 008001bf 00000000 00000000 00000000
CPU: Common caps: 008001bf 00000000 00000000 00000000
CPU: Intel Mobile Pentium MMX stepping 02
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
PCI: PCI BIOS revision 2.10 entry at 0xfd84f, last bus=21
PCI: Using configuration type 1
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
PnPBIOS: Found PnP BIOS installation structure at 0xc00f9020
PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0x9563, dseg 0x0
PnPBIOS: 17 nodes reported by PnP BIOS; 17 recorded by driver
PnPBIOS: PNP0c02: ioport range 0x1882-0x1885 has been reserved
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
apm: BIOS version 1.2 Flags 0x02 (Driver version 1.16)
Starting kswapd
BIO: pool of 256 setup, 14Kb (56 bytes/bio)
biovec: init pool 0, 1 entries, 12 bytes
biovec: init pool 1, 4 entries, 48 bytes
biovec: init pool 2, 16 entries, 192 bytes
biovec: init pool 3, 64 entries, 768 bytes
biovec: init pool 4, 128 entries, 1536 bytes
biovec: init pool 5, 256 entries, 3072 bytes
Journalled Block Device driver loaded
pty: 512 Unix98 ptys configured
Real Time Clock Driver v1.11
block: 256 slots per queue, batch=32
Floppy drive(s): fd0 is 1.44M
FDC 0 is an 8272A
ATA/ATAPI device driver v7.0.0
ATA: PCI bus speed 33.3MHz
hda: TOSHIBA MK6411MAT, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
  hda: 12685680 sectors, CHS=13424/15/63
  hda: [PTBL] [789/255/63] hda1 hda2 hda3
mice: PS/2 mouse device common for all mice
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP
IP: routing cache hash table of 512 buckets, 4Kbytes
TCP: Hash tables configured (established 4096 bind 4096)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 216k freed
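
The geometry and memory figures in the log above can be cross-checked
with a little arithmetic, which helps rule out a mis-detected disk size
as the cause of the corruption. This is a sketch only: the 512-byte
sector and 4 KiB page sizes are assumptions (the usual values for ATA
disks and 32-bit x86), and the variable and function names are
illustrative.

```python
SECTOR_BYTES = 512  # assumption: conventional ATA sector size
PAGE_BYTES = 4096   # assumption: 4 KiB pages on 32-bit x86


def chs_sectors(cylinders, heads, sectors_per_track):
    """Number of sectors addressed by a cylinder/head/sector geometry."""
    return cylinders * heads * sectors_per_track


# Values copied from the "hda:" lines of the log.
reported_sectors = 12685680
drive_chs = chs_sectors(13424, 15, 63)  # CHS=13424/15/63
ptbl_chs = chs_sectors(789, 255, 63)    # [PTBL] [789/255/63]

# Values copied from the memory lines of the log.
zone_pages = [4096, 12304, 0]           # zone(0), zone(1), zone(2)
total_pages = 16400                     # "On node 0 totalpages"
top_of_ram = 0x4010000                  # end of the last usable e820 range

drive_matches = drive_chs == reported_sectors
ptbl_fits = ptbl_chs <= reported_sectors
capacity_bytes = reported_sectors * SECTOR_BYTES
zones_match = sum(zone_pages) == total_pages
total_kib = total_pages * PAGE_BYTES // 1024
```

With these numbers the drive geometry multiplies out to exactly the
reported sector count, the partition-table geometry fits inside it, and
the zone page counts add up to the 65600k shown in the "Memory:" line,
so the detected sizes themselves look self-consistent.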
