RE: Stability (2.2.14/15/16/17pre1)

From: George Sexton (gsexton@mhsoftware.com)
Date: Wed Jun 14 2000 - 15:19:30 EST


Andrea,

Perhaps you can help me. I know that you have done a lot of work in the
2.2.x series SMP arena.

I haven't seen a lot of SMP fixes in later kernels (15 and 16). Honestly,
they are still really unstable.

I personally have two SMP boxes that lock up within 24 hours on 2.2.16. On
2.2.14 they run 5-70 days between crashes. In every instance, the kernel
generates no oops. I hooked up a serial console and it catches nothing.
Magic SysRq doesn't work. I tried Ingo Molnar's NMI oopser and it didn't
seem to generate an oops either.

The strange thing is that most of the time, the systems crash when there is
little or no load. They run very well under a heavy load all day long, and
then crash at night or on weekends.

Do you have some patches that haven't made it into the mainstream kernel
that might help things out? If everything has been incorporated, can you
give me some ideas on where to look in the kernel source?

I am about ready to pitch SMP boxes for Linux. It just isn't working right.
There have been four posts to the list this week about SMP instability with
2.2.16, but they have essentially gone unanswered.

Any help would be really appreciated.

Here is a repost of information I put up last week:

Linux version 2.2.16pre7 (root@server.wcon.org) (gcc version egcs-2.91.66
19990314/Linux (egcs-1.1.2 release)) #4 SMP Wed Jun 7 10:04:59 MDT 2000
Intel MultiProcessor Specification v1.1
    Virtual Wire compatibility mode.
OEM ID: OEM00000 Product ID: PROD00000000 APIC at: 0xFEE00000
Processor #0 Pentium(tm) Pro APIC version 17
Processor #1 Pentium(tm) Pro APIC version 17
I/O APIC #2 Version 17 at 0xFEC00000.
Processors: 2
mapped APIC to ffffe000 (fee00000)
mapped IOAPIC to ffffd000 (fec00000)
Detected 400915 kHz processor.
Console: colour VGA+ 80x50
Calibrating delay loop... 799.54 BogoMIPS
Memory: 257680k/262144k available (1132k kernel code, 420k reserved, 2856k
data, 56k init)
Dentry hash table entries: 32768 (order 6, 256k)
Buffer cache hash table entries: 262144 (order 8, 1024k)
Page cache hash table entries: 65536 (order 6, 256k)
Checking 386/387 coupling... OK, FPU using exception 16 error reporting.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.35a (19990819) Richard Gooch (rgooch@atnf.csiro.au)
per-CPU timeslice cutoff: 100.09 usecs.
CPU0: Intel Pentium II (Deschutes) stepping 02
calibrating APIC timer ...
..... CPU clock speed is 400.9065 MHz.
..... system bus clock speed is 100.2265 MHz.
Booting processor 1 eip 2000
Calibrating delay loop... 801.18 BogoMIPS
OK.
CPU1: Intel Pentium II (Deschutes) stepping 02
Total of 2 processors activated (1600.72 BogoMIPS).
enabling symmetric IO mode... ...done.
ENABLING IO-APIC IRQs
init IO_APIC IRQs
 IO-APIC (apicid-pin) 2-0, 2-17, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
NMI Watchdog activated on source IRQ 1
number of MP IRQ sources: 20.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................

IO APIC #2......
.... register #00: 02000000
....... : physical APIC id: 02
.... register #01: 00170011
....... : max redirection entries: 0017
....... : IO APIC version: 0011
.... register #02: 00000000
....... : arbitration: 00
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
 00 000 00 1 0 0 0 0 0 0 00
 01 000 00 0 0 0 0 0 1 4 59
 02 0FF 0F 0 0 0 0 0 1 1 51
 03 000 00 0 0 0 0 0 1 1 61
 04 000 00 0 0 0 0 0 1 1 69
 05 000 00 0 0 0 0 0 1 1 71
 06 000 00 0 0 0 0 0 1 1 79
 07 000 00 0 0 0 0 0 1 1 81
 08 000 00 0 0 0 0 0 1 1 89
 09 000 00 0 0 0 0 0 1 1 91
 0a 000 00 0 0 0 0 0 1 1 99
 0b 000 00 0 0 0 0 0 1 1 A1
 0c 000 00 0 0 0 0 0 1 1 A9
 0d 000 00 1 0 0 0 0 0 0 00
 0e 000 00 0 0 0 0 0 1 1 B1
 0f 000 00 0 0 0 0 0 1 1 B9
 10 0FF 0F 1 1 0 1 0 1 1 C1
 11 000 00 1 0 0 0 0 0 0 00
 12 0FF 0F 1 1 0 1 0 1 1 C9
 13 000 00 1 0 0 0 0 0 0 00
 14 000 00 1 0 0 0 0 0 0 00
 15 000 00 1 0 0 0 0 0 0 00
 16 000 00 1 0 0 0 0 0 0 00
 17 000 00 1 0 0 0 0 0 0 00
.................................... done.
checking TSC synchronization across CPUs: passed.
mtrr: your CPUs had inconsistent variable MTRR settings
mtrr: probably your BIOS does not setup all CPUs
PCI: PCI BIOS revision 2.10 entry at 0xfb3f0
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI->APIC IRQ transform: (B0,I11,P0) -> 16
PCI->APIC IRQ transform: (B0,I17,P0) -> 18
PCI->APIC IRQ transform: (B1,I0,P0) -> 16
Linux NET4.0 for Linux 2.2
Based upon Swansea University Computer Society NET3.039
WAN Router v1.1 (c) 1995-1999 Sangoma Technologies Inc.
NET4: Unix domain sockets 1.0 for Linux NET4.0.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
TCP: Hash tables configured (ehash 262144 bhash 65536)
Initializing RT netlink socket
Starting kswapd v 1.5
parport0: PC-style at 0x378 [SPP,PS2,EPP]
Detected PS/2 Mouse Port.
Serial driver version 4.27 with<4>Keyboard timeout[2]
 no serial options enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
pty: 256 Unix98 ptys configured
lp0: using parport0 (polling).
Real Time Clock Driver v1.09
PIIX4: IDE controller on PCI bus 00 dev 39
PIIX4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:pio, hdb:pio
hda: Maxtor 91728D8, ATA DISK drive
hdb: ATAPI CD-ROM DRIVE 40X MAXIMUM, ATAPI CDROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: Maxtor 91728D8, 16479MB w/512kB Cache, CHS=2100/255/63, UDMA
hdb: ATAPI 40X CD-ROM drive, 128kB Cache
Uniform CD-ROM driver Revision: 3.08
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
(scsi0) <Adaptec AIC-7880 Ultra SCSI host adapter> found at PCI 0/11/0
(scsi0) Wide Channel, SCSI ID=7, 16/255 SCBs
(scsi0) Downloading sequencer code... 423 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4
       <Adaptec AIC-7880 Ultra SCSI host adapter>
scsi : 1 host.
(scsi0:0:0:0) Synchronous at 40.0 Mbyte/sec, offset 8.
  Vendor: IBM Model: DDRS-39130W Rev: S97B
  Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
(scsi0:0:3:0) Synchronous at 5.0 Mbyte/sec, offset 15.
  Vendor: DEC Model: DLT2000 Rev: 8B37
  Type: Sequential-Access ANSI SCSI revision: 02
  Vendor: IOMEGA Model: ZIP 100 Rev: J.03
  Type: Direct-Access ANSI SCSI revision: 02
Detected scsi removable disk sdb at scsi0, channel 0, id 5, lun 0
scsi : detected 2 SCSI disks total.
SCSI device sda: hdwr sector= 512 bytes. Sectors= 17850000 [8715 MB] [8.7
GB]
sdb : READ CAPACITY failed.
sdb : status = 1, message = 00, host = 0, driver = 28
sdb : extended sense code = 2
sdb : block size assumed to be 512 bytes, disk size 1GB.
PPP: version 2.3.7 (demand dialling)
TCP compression code copyright 1989 Regents of the University of California
PPP line discipline registered.
eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
eepro100.c: $Revision: 1.20.2.10 $ 2000/05/31 Modified by Andrey V.
Savochkin <saw@saw.sw.com.sg> and others
eth0: Intel PCI EtherExpress Pro100 82557, 00:90:27:84:9C:17, IRQ 18.
  Receiver lock-up bug exists -- enabling work-around.
  Board assembly 721383-006, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x04f4518b).
eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
eepro100.c: $Revision: 1.20

CONFIG_EXPERIMENTAL=y

CONFIG_M686=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_1GB=y
CONFIG_MTRR=y
CONFIG_SMP=y

CONFIG_MODULES=y

CONFIG_NET=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_QUIRKS=y
CONFIG_PCI_OLD_PROC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_SYSVIPC=y
CONFIG_SYSCTL=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y
CONFIG_PARPORT=y
CONFIG_PARPORT_PC=y

CONFIG_PNP=y

CONFIG_BLK_DEV_FD=y
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_IDEDMA_AUTO=y
CONFIG_PARIDE_PARPORT=y

CONFIG_PACKET=y
CONFIG_NETLINK=y
CONFIG_RTNETLINK=y
CONFIG_NETLINK_DEV=y
CONFIG_FIREWALL=y
CONFIG_FILTER=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_RTNETLINK=y
CONFIG_NETLINK=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_FIREWALL=y
CONFIG_IP_FIREWALL_NETLINK=y
CONFIG_NETLINK_DEV=y
CONFIG_IP_TRANSPARENT_PROXY=y
CONFIG_IP_MASQUERADE=y
CONFIG_IP_MASQUERADE_ICMP=y
CONFIG_IP_MASQUERADE_MOD=y
CONFIG_IP_MASQUERADE_IPAUTOFW=y
CONFIG_SYN_COOKIES=y
CONFIG_SKB_LARGE=y
CONFIG_WAN_ROUTER=y

CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=m
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y

CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_TCQ_ON_BY_DEFAULT=y
CONFIG_AIC7XXX_CMDS_PER_DEVICE=8
CONFIG_AIC7XXX_RESET_DELAY=5

CONFIG_NETDEVICES=y

CONFIG_DUMMY=m

CONFIG_NET_ETHERNET=y
CONFIG_NET_EISA=y
CONFIG_EEXPRESS_PRO100=y

CONFIG_PPP=y

CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_SERIAL=y
CONFIG_SERIAL_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=64
CONFIG_PRINTER=y
CONFIG_PRINTER_READBACK=y
CONFIG_MOUSE=y

CONFIG_PSMOUSE=y
CONFIG_82C710_MOUSE=y

CONFIG_WATCHDOG=y

CONFIG_NMI_WATCHDOG=y
CONFIG_NMI_WATCHDOG_IRQ=1
CONFIG_RTC=y

CONFIG_AUTOFS_FS=y
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_PROC_FS=y
CONFIG_DEVPTS_FS=y
CONFIG_EXT2_FS=y

CONFIG_SMB_FS=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="cp437"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_VGA_CONSOLE=y
CONFIG_VIDEO_SELECT=y
CONFIG_MAGIC_SYSRQ=y

George Sexton
MH Software, Inc.
Voice: 303 438 9585
http://www.mhsoftware.com

-----Original Message-----
From: owner-linux-kernel@vger.rutgers.edu
[mailto:owner-linux-kernel@vger.rutgers.edu]On Behalf Of Andrea
Arcangeli
Sent: Tuesday, June 13, 2000 5:19 PM
To: Marcelo Tosatti
Cc: Rodrigo Barbosa; linux-kernel@vger.rutgers.edu
Subject: Re: Stability (2.2.14/15/16/17pre1)

On Tue, 13 Jun 2000, Marcelo Tosatti wrote:

>Running memtest on 2.2.15 caused the kernel to kill the
>mmap02 program _fast_. 2.2.16 is able to run all memtest programs
>without problems. Looking at 2.2.15's page_alloc.c:

So it was the swap_out that was causing the system to think some memory
was released when it wasn't. I never had problems, but I guess the
no-swapout patch could have hidden the problem very well, and that's why I
might not have noticed it.

>        if (nr_free_pages >= freepages.high)
>        {
>                /* share RO cachelines in fast path */
>                if (current->trashing_mem)
>                        current->trashing_mem = 0;
>                goto ok_to_allocate;
>        }
>        else
>        {
>                if (nr_free_pages < freepages.low)
>                        wake_up_interruptible(&kswapd_wait);
>                if (nr_free_pages > freepages.min && !current->trashing_mem)
>                        goto ok_to_allocate;
>        }
>
>        current->trashing_mem = 1;
>        current->flags |= PF_MEMALLOC;
>        freed = try_to_free_pages(gfp_mask);
>        current->flags &= ~PF_MEMALLOC;
>
>As I see here, a process which has trashing_mem == 1 will call
>do_try_to_free_pages() even if nr_free_pages == freepages.high.
>Is that the correct behaviour?

It's definitely correct; that was a _feature_ that has been dropped by
mistake, IMHO.
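
A minimal user-space sketch of the branch quoted above may make the
behaviour easier to follow. It only mirrors the control flow of the quoted
snippet; struct watermarks, struct task_model and must_reclaim() are
hypothetical names invented for this illustration and are not kernel
interfaces. As the snippet is written, the >= freepages.high test takes the
fast path and clears the flag, so a task with trashing_mem set is sent to
try_to_free_pages() for any free-page count below freepages.high.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's freepages thresholds. */
struct watermarks {
	int min;
	int low;
	int high;
};

/* Hypothetical stand-in for the per-task state the snippet touches. */
struct task_model {
	bool trashing_mem;	/* spelled as in the quoted 2.2.15 code */
};

/*
 * Mirrors the quoted branch structure. Returns true when the caller would
 * fall through to try_to_free_pages(), false when it would jump to
 * ok_to_allocate. *wake_kswapd reports whether kswapd would be woken.
 */
static bool must_reclaim(struct task_model *t, int nr_free_pages,
			 const struct watermarks *wm, bool *wake_kswapd)
{
	assert(wm->min <= wm->low && wm->low <= wm->high);
	*wake_kswapd = false;

	if (nr_free_pages >= wm->high) {
		/* Fast path: plenty of free memory, clear the flag. */
		t->trashing_mem = false;
		return false;
	}

	if (nr_free_pages < wm->low)
		*wake_kswapd = true;
	if (nr_free_pages > wm->min && !t->trashing_mem)
		return false;

	/* Slow path: the task is marked and has to free pages itself. */
	t->trashing_mem = true;
	return true;
}
```

In this model, once a task has been marked it keeps reclaiming on every
allocation until free memory climbs back to the high threshold, which is the
behaviour described above as a feature.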

The cure for the problem you mentioned is far away from page_alloc.c.
I think this was the cure:

                /* Then, try to page stuff out.. */
+               swapcount = flushcount;
                while (swap_out(priority, gfp_mask)) {
-                       if (!--count)
-                               goto done;
+                       if (!--swapcount)
+                               break;
                }
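
The effect of giving swap_out its own counter can be sketched with a toy
user-space model. Everything here is an assumption made for illustration:
struct vm_model and the model_* functions are hypothetical, the surrounding
loop is simplified, the number of passes is arbitrary, and the model assumes
that a successful swap_out only makes a page reclaimable later rather than
freeing it at once.

```c
#include <assert.h>

/* Hypothetical model state; these are not kernel structures. */
struct vm_model {
	int free_pages;		/* pages that are really free */
	int reclaimable;	/* pages model_shrink_mmap() can free */
	int swappable;		/* pages model_swap_out() can still unmap */
};

/* Frees one page if any is reclaimable; returns 1 on success. */
static int model_shrink_mmap(struct vm_model *vm)
{
	if (!vm->reclaimable)
		return 0;
	vm->reclaimable--;
	vm->free_pages++;
	return 1;
}

/* Assumption: success makes a page reclaimable, it does not free it. */
static int model_swap_out(struct vm_model *vm)
{
	if (!vm->swappable)
		return 0;
	vm->swappable--;
	vm->reclaimable++;
	return 1;
}

/*
 * Simplified reclaim loop. With fixed == 0 a swap_out success is charged
 * to count, as in the removed lines; with fixed != 0 it is charged to a
 * separate swapcount, as in the added lines. Returns 1 when the loop
 * reports that count pages were freed.
 */
static int model_try_to_free(struct vm_model *vm, int count,
			     int flushcount, int fixed)
{
	int pass;

	assert(count > 0 && flushcount > 0);
	for (pass = 0; pass < 7; pass++) {	/* arbitrary pass count */
		while (model_shrink_mmap(vm))
			if (!--count)
				return 1;
		if (fixed) {
			int swapcount = flushcount;

			while (model_swap_out(vm))
				if (!--swapcount)
					break;
		} else {
			while (model_swap_out(vm))
				if (!--count)
					return 1;
		}
	}
	return 0;
}
```

Starting from no free and no reclaimable pages, the old variant reports
success after four swap_out calls although free_pages is still zero. The
fixed variant reports success only after the next pass has actually freed
four pages.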

Andrea

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/




This archive was generated by hypermail 2b29 : Thu Jun 15 2000 - 21:00:33 EST