spoiling the party --- slab corruption in 2.3.99-pre3

From: Andreas Bombe (andreas.bombe@munich.netsurf.de)
Date: Wed Apr 05 2000 - 17:38:30 EST


I ran into serious slab (inode_cache) corruption in 2.3.99-pre3
today. The hardware is a K6/233 with 96MB and SCSI, nothing special,
nothing overclocked, and it has served me for years without hardware
hiccups (except for once when a drive cable blocked the CPU fan, but
that's a different story).

This happened while doing a tar backup to tape. The tar output goes
through a buffer program and I don't think any data had gone to tape
yet. I.e. with lots of small files in the root partition being piped
to a buffer program, there was a lot of file activity in quite a short
time (possibly triggering a race).

Excerpts from the kernel logs:

Apr 5 19:36:16 storm kernel: sym53c875-0-<3,*>: FAST-10 SCSI 10.0 MB/s (100 ns, offset 8)
Apr 5 19:36:16 storm kernel: st0: Block limits 1 - 16777215 bytes.
Apr 5 19:37:54 storm kernel: kmem_alloc: Bad slab magic (corrupt) (name=inode_cache)
Apr 5 19:38:25 storm last message repeated 1413 times
Apr 5 19:38:48 storm last message repeated 1042 times
Apr 5 19:38:48 storm kernel: socket: no more sockets
Apr 5 19:38:48 storm kernel: kmem_alloc: Bad slab magic (corrupt) (name=inode_cache)
Apr 5 19:38:56 storm last message repeated 346 times
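
To put a number on the excerpt above, here is a rough sketch that tallies
the slab errors, including the collapsed "last message repeated" lines. It
assumes each repeat line refers to the most recent non-repeat message and
counts N additional occurrences; the helper name count_messages is made up
for this illustration.

```python
import re

# Lines copied verbatim from the log excerpt above.
LOG = """\
Apr 5 19:37:54 storm kernel: kmem_alloc: Bad slab magic (corrupt) (name=inode_cache)
Apr 5 19:38:25 storm last message repeated 1413 times
Apr 5 19:38:48 storm last message repeated 1042 times
Apr 5 19:38:48 storm kernel: socket: no more sockets
Apr 5 19:38:48 storm kernel: kmem_alloc: Bad slab magic (corrupt) (name=inode_cache)
Apr 5 19:38:56 storm last message repeated 346 times
"""

REPEAT = re.compile(r"last message repeated (\d+) times")


def count_messages(log, needle):
    """Count messages containing `needle`, expanding repeat lines.

    Assumption: a repeat line applies to the last non-repeat message
    and stands for that many additional occurrences.
    """
    total = 0
    last_matches = False  # does the last real message contain needle?
    for line in log.splitlines():
        m = REPEAT.search(line)
        if m:
            if last_matches:
                total += int(m.group(1))
        else:
            last_matches = needle in line
            if last_matches:
                total += 1
    return total


print(count_messages(LOG, "Bad slab magic"))
```

Under that assumption the excerpt accounts for 2803 "Bad slab magic"
messages between 19:37:54 and 19:38:56.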

You get the point. Programs that were already running kept running
(gpm stopped working somewhere along the way, though), but new programs
could hardly be executed because either the binary or its shared
libraries couldn't be loaded, not even stat()ed - trying to just gave
more slab errors.

A few worked, ls for example. shutdown didn't, so I used magic SysRq
to unmount & reboot. Before that I did a show-memory SysRq, which gave:

Apr 5 21:07:11 storm kernel: SysRq: Show Memory
Apr 5 21:07:11 storm kernel: Mem-info:
Apr 5 21:07:11 storm kernel: Free pages: 45052kB ( 0kB HighMem)
Apr 5 21:07:11 storm kernel: ( Free: 11263, lru_cache: 9424 (192 384 576) )
Apr 5 21:07:11 storm kernel: DMA: 57*4kB 42*8kB 22*16kB 16*32kB 13*64kB 13*128kB 12*256kB 4*512kB 4*1024kB 0*2048kB = 13140kB)
Apr 5 21:07:11 storm kernel: Normal: 780*4kB 711*8kB 392*16kB 188*32kB 89*64kB 16*128kB 4*256kB 2*512kB 1*1024kB 0*2048kB = 31912kB)
Apr 5 21:07:11 storm kernel: HighMem: = 0kB)
Apr 5 21:07:11 storm kernel: Swap cache: add 2629, delete 1141, find 1893/2496
Apr 5 21:07:11 storm kernel: Free swap: 97096kB
Apr 5 21:07:11 storm kernel: 24576 pages of RAM
Apr 5 21:07:11 storm kernel: 0 pages of HIGHMEM
Apr 5 21:07:11 storm kernel: 1034 reserved pages
Apr 5 21:07:11 storm kernel: 7898 pages shared
Apr 5 21:07:11 storm kernel: 1488 pages swap cached
Apr 5 21:07:11 storm kernel: 0 pages in page table cache
Apr 5 21:07:11 storm kernel: Buffer memory: 2048kB
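
As a sanity check on the dump above, the per-zone free lists can be summed
and compared with the reported totals. This is only a sketch: the 4 kB page
size is an assumption (the usual i386 value), and zone_free_kb is a name
invented for this example.

```python
import re

# Free-list lines copied from the Show Memory output above.
DMA = ("57*4kB 42*8kB 22*16kB 16*32kB 13*64kB 13*128kB "
       "12*256kB 4*512kB 4*1024kB 0*2048kB")
NORMAL = ("780*4kB 711*8kB 392*16kB 188*32kB 89*64kB 16*128kB "
          "4*256kB 2*512kB 1*1024kB 0*2048kB")

PAGE_KB = 4  # assumption: 4 kB pages


def zone_free_kb(free_list):
    """Sum 'count*sizekB' entries of one zone's free list, in kB."""
    pairs = re.findall(r"(\d+)\*(\d+)kB", free_list)
    return sum(int(count) * int(size) for count, size in pairs)


dma_kb = zone_free_kb(DMA)
normal_kb = zone_free_kb(NORMAL)

print(dma_kb, normal_kb, dma_kb + normal_kb)  # per-zone and total free kB
print(11263 * PAGE_KB)                        # "Free: 11263" pages in kB
print(24576 * PAGE_KB // 1024)                # "24576 pages of RAM" in MB
```

The sums come out to 13140 kB and 31912 kB, matching the per-zone figures,
and together they give the 45052 kB on the "Free pages" line (11263 pages at
4 kB each). 24576 pages likewise correspond to the 96MB in the machine, so
the page accounting itself looks consistent in this dump.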

System logging continued normally; the excerpts above are pasted from
the logfiles on disk. Doing the same backup again after rebooting
succeeded without problems.

-- 
 Andreas E. Bombe <andreas.bombe@munich.netsurf.de>    DSA key 0x04880A44
http://home.pages.de/~andreas.bombe/    http://linux1394.sourceforge.net/



