A new crashme 2.1.38 was Re: i386 memory corrupted by free_initmem()

Wolfgang Wander (wwc@lars.desy.de)
Fri, 16 May 1997 16:59:56 +0200


Andries.Brouwer@cwi.nl writes:
> In recent kernels I noticed some strange memory corruption.
> This turns out to be caused by free_initmem() in
> arch/i386/mm/init.c. (It frees too much.)
> The easiest fix is to comment out this entire routine.
> I was too lazy too think about what the proper rounding is
> in case the routine is not commented out.

Hi,

can confirm this one. In fact we got this kind of error plus several
other ones in 2.1.37 when dumping core of very large processes. (see
my last crashme on this list ;-)

(Un)fortunately this problem is now solve by locking the kernel in
2.1.38 during the core generation (IMHO just curing the problem not
the cause):
--from patch-2.1.38.gz------------------------------------------------
--- v2.1.37/linux/arch/i386/kernel/signal.c Tue May 13 22:41:01 1997
+++ linux/arch/i386/kernel/signal.c Thu May 15 12:24:24 1997
@@ -392,10 +392,12 @@

case SIGQUIT: case SIGILL: case SIGTRAP:
case SIGABRT: case SIGFPE: case SIGSEGV:
+ lock_kernel();
if (current->binfmt && current->binfmt->core_dump) {
if (current->binfmt->core_dump(signr, regs))
signr |= 0x80;
}
+ unlock_kernel();
/* fall through */
default:
spin_lock_irq(&current->sigmask_lock);

----------------------------------------------------------------------
Due to this fix the reproduction of the list corruption is more
difficult but are we mice ore men? ;-)

Here is a new crashme.c for 2.1.38
--------------------------------------------------------------
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

/*
This program crashes either linux-2.1.38 in SMP mode
or it falls in uninterruptible sleep.

It forks to generate two processes
and then starts to generate a 500MB file (twice)
after that it copies this file to another one until
the uptime of the system ends
*/

#define scratch "/data2/tmp" /* use some filesystem with lots of
space here */
#define filesize 500000000 /* size of the files to be generated */

int
main()
{
int fd,fd2; /* file descriptor */
char buf[65536];
int count; /* total bytes written/read */
int bytes; /* bytes written/read */
char filename[128]; /* filename to read/write */
char filename2[128]; /* filename to read/write */
int id; /* just to know who we are*/
int pass;

id = fork();

/* now there are two of us */
if( id < 0 ) {
perror("fork()");
exit(1);
}
if( id != 0 ) {
sleep(20);
id = 1; /* the father */
}

sprintf(filename,"%s/testfile%d",scratch, id);
sprintf(filename2,"%s/copyfile%d",scratch, id);

memset( buf, 1, sizeof(buf)); /* most important data */

fd = creat(filename,0644);
if( fd < 0 ) {
perror("creat()");
exit(1);
}
printf("%d: Start writing\n", id);
for( count = 0; count < filesize; count += bytes ) {
bytes = write( fd, buf, sizeof(buf));
if( bytes <= 0 ) {
perror("write()");
exit(1);
}
}
close(fd);

for( pass = 0; ; pass++ ) {
fd = open( filename, O_RDONLY);
if( fd < 0 ) {
perror("open()");
exit(1);
}
fd2 = creat(filename2,0644);
if( fd < 0 ) {
perror("creat()");
exit(1);
}
printf("%d: Start copying pass %d\n", id, pass);
for( count = 0; count < filesize; count += bytes ) {
bytes = read( fd, buf, sizeof(buf));
if( bytes <= 0 ) {
perror("read()");
exit(1);
}
write( fd2, buf, bytes); /* neglect errors for now */
}
close(fd);
close(fd2);
}
return 0;
}
------------------------------------------------------------

This crashme is however not very effective. On our ten machines
I needed it to run for 4 hours to receive 2 oopses, 5 silent locks and
7 uninterruptiple idle processes.

First the oopses:

Oops1
------------------------------------------------------------
Oops: 0002
CPU: 0
EIP: 0010:[<c01285aa>]
EFLAGS: 00010283
eax: 00000000 ebx: c6710840 ecx: c6710840 edx: 00000000
esi: c6710840 edi: 00000811 ebp: 0040e850 esp: c6fc5e98
ds: 0018 es: 0018 ss: 0018
Process crashme (pid: 185, process nr: 30, stackpage=c6fc5000)
Stack: c6fc5f20 000652d8 00000000 c6f88dc0 00000000 00000811 00000100 c6710840
c5100811 00000000 c6710840 c6710840 c6710840 c6710840 c6710840 c6710840
c6710840 c6710840 c6710840 c6710840 c6710840 c6710840 c79dc90c c6710840
Call Trace:
Code: 89 5c ac 28 45 89 5c 24 10 8b 5b 10 85 db 0f 85 4a fe ff ff
------------------------------------------------------------
Using `System.map' to map addresses to symbols.

>>EIP: c01285aa <brw_page+24a/370>

Code: c01285aa <brw_page+24a/370>
------------------------------------------------------------
0xc0128578 <brw_page+536>: movl %eax,0x10(%esp,1)
0xc012857c <brw_page+540>: testl %eax,%eax
0xc012857e <brw_page+542>: je 0xc012846d <brw_page+269>
0xc0128584 <brw_page+548>: pushl %eax
0xc0128585 <brw_page+549>: call 0xc0127f04 <__brelse>
0xc012858a <brw_page+554>: addl $0x4,%esp
0xc012858d <brw_page+557>: jmp 0xc012846d <brw_page+269>
0xc0128592 <brw_page+562>: movl %esi,%esi
0xc0128594 <brw_page+564>: cmpl $0x0,0x4c(%esp,1)
0xc0128599 <brw_page+569>: jne 0xc01285a4 <brw_page+580>
0xc012859b <brw_page+571>: lock btrl $0x0,0x14(%ebx)
0xc01285a1 <brw_page+577>: jmp 0xc01285aa <brw_page+586>
0xc01285a3 <brw_page+579>: nop
0xc01285a4 <brw_page+580>: lock btsl $0x1,0x14(%ebx)
**> 0xc01285aa <brw_page+586>: movl %ebx,0x28(%esp,%ebp,4)
0xc01285ae <brw_page+590>: incl %ebp
0xc01285af <brw_page+591>: movl %ebx,0x10(%esp,1)
0xc01285b3 <brw_page+595>: movl 0x10(%ebx),%ebx
0xc01285b6 <brw_page+598>: testl %ebx,%ebx
------------------------------------------------------------
"fs/buffer.c" line 1215
if (rw == READ)
clear_bit(BH_Uptodate, &next->b_state);
else
set_bit(BH_Dirty, &next->b_state);
**> arr[nr++] = next;
------------------------------------------------------------
Seems to indicate again a list corruption in fs/buffer.c

Oops2
------------------------------------------------------------
Oops: 0000
CPU: 1
EIP: 0010:[<c0111761>]
EFLAGS: 00010212
eax: 01010101 ebx: c7bfd740 ecx: 00000811 edx: 01010101
esi: c1b145c4 edi: c45b8000 ebp: c45b9c78 esp: c45b9c6c
ds: 0018 es: 0018 ss: 0018
Process crashme (pid: 1260, process nr: 34, stackpage=c45b9000)
Stack: c7bfd740 c7bfd804 00000020 c4e9dc70 c016172c c1b145c8 c7bfd740 c0171421
c7bfd740 00000000 c4e9dc34 00000034 c7bfd740 c01718ca c4e9dc70 c7bfd740
00000034 c7bfd740 00000000 c4e9dc70 c4e9dc34 fcec8c54 00000000 00000000
Call Trace: [<c016172c>] [<c0171421>] [<c01718ca>] [<c0176129>] [<c01763e2>] [<c0160034>] [<c0169c94>]
[<c0169e48>] [<c0163c84>] [<c0118461>] [<c010b4f5>] [<c011040c>] [<c01099c0>] [<c0111630>] [<c0106000>]
[<c01acb96>] [<c0129b11>] [<c0141b36>] [<c01278af>] [<c01421f0>] [<c0142515>] [<c0140801>] [<c010b34d>]
[<c01b00c8>] [<c01886ce>] [<c011cede>] [<c0125446>] [<c01098da>]
Code: 8b 02 83 f8 01 75 5c 9c 5b fa f0 0f ba 2d c4 1f 1e c0 00 0f
------------------------------------------------------------
>>EIP: c0111761 <wake_up_interruptible+41/bc>
Trace: c016172c <sock_def_callback2+18/2c>
Trace: c0171421 <tcp_data+c1/d0>
Trace: c01718ca <tcp_rcv_established+31e/3c4>
Trace: c0176129 <tcp_v4_do_rcv+51/f8>
Trace: c01763e2 <tcp_v4_rcv+212/26c>
Trace: c0160034 <sys_getsockopt+98/184>
Trace: c0169c94 <ip_local_deliver+124/178>
Trace: c0169e48 <ip_rcv+160/18c>
Trace: c0163c84 <net_bh+128/160>
Trace: c0118461 <do_bottom_half+85/a4>
Trace: c010b4f5 <do_IRQ+18d/198>
Trace: c011040c <smp_apic_timer_interrupt+20/24>
Trace: c01099c0 <ret_from_intr>
Trace: c0111630 <schedule+318/324>
Trace: c0106000 <init_task_union>
Trace: c01acb96 <scsi_do_cmd+44a/478>
Trace: c0129b11 <__wait_on_super+79/d8>
Trace: c0141b36 <ext2_alloc_block+22/19c>
Trace: c01278af <getblk+2f/36c>
Trace: c01421f0 <block_getblk+160/264>
Trace: c0142515 <ext2_getblk+221/22c>
Trace: c0140801 <ext2_file_write+1d1/4cc>
Trace: c010b34d <__global_restore_flags+3d/58>
Trace: c01b00c8 <do_sd_request+34/1b0>
Trace: c01886ce <unplug_device+3e/48>
Trace: c011cede <generic_file_read+32e/650>
Trace: c0125446 <sys_write+116/18c>
Trace: c01098da <system_call+3a/40>

Code: c0111761 <wake_up_interruptible+41/bc>
------------------------------------------------------------
je 0xc01117c8 <wake_up_interruptible+168>
0xc0111748 <wake_up_interruptible+40>: leal 0xfffffffc(%eax),%esi
0xc011174b <wake_up_interruptible+43>: cmpl %esi,%ecx
0xc011174d <wake_up_interruptible+45>:
je 0xc01117c8 <wake_up_interruptible+168>
0xc011174f <wake_up_interruptible+47>: movl %esp,%eax
0xc0111751 <wake_up_interruptible+49>: movl %eax,%edi
0xc0111753 <wake_up_interruptible+51>: andl $0xffffe000,%edi
0xc0111759 <wake_up_interruptible+57>: leal 0x0(%esi),%esi
0xc011175c <wake_up_interruptible+60>: movl (%ecx),%edx
0xc011175e <wake_up_interruptible+62>: movl 0x4(%ecx),%ecx
**> 0xc0111761 <wake_up_interruptible+65>: movl (%edx),%eax
0xc0111763 <wake_up_interruptible+67>: cmpl $0x1,%eax
0xc0111766 <wake_up_interruptible+70>:
jne 0xc01117c4 <wake_up_interruptible+164>
0xc0111768 <wake_up_interruptible+72>: pushf
0xc0111769 <wake_up_interruptible+73>: popl %ebx
0xc011176a <wake_up_interruptible+74>: cli
0xc011176b <wake_up_interruptible+75>: lock btsl $0x0,0xc01e1fc4
0xc0111774 <wake_up_interruptible+84>:
------------------------------------------------------------

void wake_up_interruptible(struct wait_queue **q)
{
struct wait_queue *next;

read_lock(&waitqueue_lock);
if (q && (next = *q)) {
struct wait_queue *head;

head = WAIT_QUEUE_HEAD(q);
while (next != head) {
struct task_struct *p = next->task;
next = next->next;
**> if (p->state == TASK_INTERRUPTIBLE)
wake_up_process(p);
}
}
read_unlock(&waitqueue_lock);
}
------------------------------------------------------------
Some task list corruption?

The silent lockups are not very informative by definition. I monitor
the systems via serial line so the possibility to lose date in the
syslog is rather small.

Seven time however the tasks itself went into a strange state of
uninterruptible sleep. They are no more killable (I guess thats what
uninterruptible means) and produce a constant load of 1 without cpu
consumption.

Some /proc information from these processes (still the above crashme
code):
> cat maps
08048000-08049000 r-xp 00000000 00:02 553980166
08049000-0804a000 rw-p 00000000 00:02 553980166
40000000-40005000 r-xp 00000000 08:01 79902
40005000-40007000 rw-p 00004000 08:01 79902
40007000-40009000 rw-p 00000000 00:00 0
4000a000-40093000 r-xp 00000000 08:01 79886
40093000-40099000 rw-p 00088000 08:01 79886
40099000-400cb000 rw-p 00000000 00:00 0
bffef000-c0000000 rwxp ffff0000 00:00 0

> cat stat
221 (crashme) D 220 220 201 1216 201 64 23 0 1574989 0 90 43344 0 0 0 0 0 0 99127 905216 66 2147483647 134512640 134515234 3221224088 3
221158216 1073980704 0 0 2147483648 0 3222391632 0 0

> cat statm
66 66 43 2 0 64 29

> cat status
Name: crashme
State: D (disk sleep)
Pid: 221
PPid: 220
Uid: 2600 2600 2600 2600
Gid: 100 100 100 100
VmSize: 884 kB
VmLck: 0 kB
VmRSS: 264 kB
VmData: 208 kB
VmStk: 68 kB
VmExe: 4 kB
VmLib: 568 kB
SigPnd: 00000000
SigBlk: 00000000
SigIgn: 80000000
SigCgt: 00000000

----------------------------------------------------------------------

All this behaviour is only 'available' in SMP mode.

In our data production we got lots of more crashlogs and oopses which
I sent to Linus already - the programs causing this problems have a
similar structure as the above crashme.c but are a litte ;-) more
complex.

Is this helpful to anyone out there?

Wolfgang

--
   _/  _/ _/  _/ _/_/_/ . Wolfgang Wander HERMES Collaboration 
  _/  _/ _/  _/ _/     . DESY Hamburg    Email: Wolfgang.Wander@desy.de
 _/_/_/ _/_/_/ _/     . Notkestr. 85    Tel: +49 40 8998 4638  Fax: -4034
_/_/_/ _/_/_/ _/_/_/ . 22603 Hamburg   http://www-hermes.desy.de/wander.html