dump device, reboot on kernel panic

John Daley (johnd@virtual-impact.com)
Fri, 06 Nov 1998 19:28:57 -0800


Hi,

I've got some questions and ideas about maintaining Linux boxes
remotely, mostly as they relate to the kernel. Hope this isn't
too off-topic for this group. I also have a specific kernel
problem at the bottom of this post that I was hoping to get some
help on. Thanks!

Background:

Our company supports a bunch of cheapo Pentium boxes scattered
about the country running Linux. They are maintained remotely
through either PPP or ISDN connections. This works great except for
the rare kernel panic that leaves the machine in a state where we
can't dial into it. In order to reduce the occurrences of this
happening, and increase our ability to maintain machines remotely,
we've done a couple of tricks, like putting Linux on a small secondary
partition that we can boot into in case of trouble. I hacked
rc.sysinit so that, if a disk doesn't pass fsck, going into repair
mode is delayed until after networking is up. Then we can dial in,
boot to the alternate root, and fsck all the primary filesystems.
UPSs and RAID systems are outside of our budget constraints. We've
also looked at several dial-into-something-to-cause-a-hard-reboot
devices, but they were either too expensive or didn't work
(suggestions welcome). I guess hardware watchdog boards would be
an option too, but I haven't looked into that.
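
For reference, the user-space side of the watchdog idea can be
sketched in a few lines of C. This is only a sketch, assuming the
kernel's /dev/watchdog character device (backed by a board driver or
the software watchdog module); the function names are made up for
illustration and nothing here has been tried on these machines.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open the watchdog device. Opening it arms the timer: if nothing
 * writes to the device within the driver's timeout, the machine is
 * reset. dev_path is assumed to be "/dev/watchdog" in real use.
 * Returns a file descriptor, or -1 on failure. */
int watchdog_open(const char *dev_path)
{
    assert(dev_path != NULL);
    return open(dev_path, O_WRONLY);
}

/* "Pet" the watchdog: any write restarts the countdown.
 * Returns 0 on success, -1 on failure. */
int watchdog_pet(int fd)
{
    return write(fd, "\0", 1) == 1 ? 0 : -1;
}
```

A small daemon would call watchdog_open() once and then loop over
watchdog_pet() followed by a sleep() shorter than the driver's
timeout. If the kernel panics or user space wedges, the writes stop
and the board resets the box.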

Some ideas:

In order to make a Linux box 100% maintainable remotely (until a hard
failure occurs, of course), what do y'all think about these ideas
(maybe they are already available?):

(1) Have a 'reboot on panic' capability, optionally booting into
another partition, with switching between the modes through /proc or so.

(2) Have the ability to write a system image to a pre-defined dump
partition when the kernel panics.

(3) Journaled filesystem
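
Part of (1) may already be covered: Linux has a panic timeout that
can be set with the panic=N boot parameter or through
/proc/sys/kernel/panic. A minimal sketch of setting it from C follows,
assuming the running kernel exposes that file (worth checking on a
given 2.0.x build). The function name is made up for illustration, and
this covers only the reboot itself, not booting into another partition
or the dump in (2).

```c
#include <assert.h>
#include <stdio.h>

/* Write a panic timeout, in seconds, to the given sysctl file.
 * A positive value asks the kernel to reboot that many seconds
 * after a panic; 0 leaves the machine halted.
 * Returns 0 on success, -1 on failure. */
int set_panic_timeout(const char *sysctl_path, int seconds)
{
    FILE *f;
    int ok;

    assert(sysctl_path != NULL);
    f = fopen(sysctl_path, "w");
    if (f == NULL)
        return -1;
    ok = fprintf(f, "%d\n", seconds) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Called as set_panic_timeout("/proc/sys/kernel/panic", 30) by root,
for example from an rc script helper, this would request a reboot 30
seconds after a panic.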

I worked on AIX a while back, and it has the ability to do (1) and (2). A
reboot capability from a kernel crash may get the machine limping along
enough to dial into it and fix whatever's broke. A system image was an
absolute necessity for debugging hard or impossible to reproduce kernel
problems on AIX. We would have customers 'dd' the dump device to a tape or
whatever and send it in along with the kernel and the system map. Then you
really have everything you need to figure out what happened. It could make
quick work of fixing nasty deadlock and timing problems, as I'm sure
are coming up in the Linux SMP work being done. How hard would it be to
do (1) and (2)? Does it make sense to do either of these for Linux?

I understand someone (at RedHat?) may be working on a journaled
filesystem for Linux. Any ETA on this?

And now for a real kernel problem....:

I have a very old kernel (2.0.18) running Red Hat 4.0 that crashes
sometimes. First off, can I upgrade to kernel version 2.0.35 without
updating libc or anything? I can't reproduce the problem (see
background above). The crash is at get_empty_inode+68/324. I know
nothing about this fs code or 586 assembler. I think it died
trying to reference inode->i_count at line 493 in inode.c, which
would indicate a corrupt inode list. Suggestions on how to proceed?
Note, I can't just upgrade because I don't have physical access
to the machine. I could upgrade the kernel remotely if that
doesn't cause probs. Can any underlying problem be guessed at
from this data (bad disk etc.)?

--------- machine info ----------
# ./getinfo
-- Versions installed: (if some fields are empty or looks
-- unusual then possibly you have very old versions)
Linux arkchild-site 2.0.18 #1 Tue Oct 22 14:28:15 EDT 1996 i586
Kernel modules 2.0.0
./getinfo: gcc: command not found
Gnu C
Linux C Library 5.3.12
Dynamic Linker (ld.so) 1.7.14
ls: /usr/lib/libg++.so: No such file or directory
Procps 1.01
Mount 2.5l
Net-tools 1.32-alpha
Kbd 0.89
Sh-utils 1.12

32MB memory
Disks:
hda: QUANTUM FIREBALL_TM3840A, 3681MB w/76kB Cache, LBA, CHS=935/128/63
hdb: QUANTUM FIREBALL_TM3840A, 3681MB w/76kB Cache, LBA, CHS=7480/16/63

-------- from /var/log/messages ----------

general protection: 0000
CPU: 0
EIP: 0010:[get_empty_inode+68/324]
EFLAGS: 00010202
eax: 00000200 ebx: 0008f5b8 ecx: 890e89e4 edx: 00000001
esi: 00000034 edi: 00000001 ebp: 001f8a08 esp: 01a27eb4
ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
Process pidof (pid: 27781, process nr: 8, stackpage=01a27000)
Stack: 00000000 001e5cfc 00000000 00121c45 001c2614 001c2614 001f8a08 019f000c
       019f0006 00162507 001f8a08 6c85000a 00000001 001c2614 01ead0f4 00000007
       00162e60 001f8a08 6c85000a 001c2614 01ead0f4 00000001 01a27f60 00000007
Call Trace: [__iget+97/488]
[proc_get_inode+23/128]
[proc_lookup+280/316]
[lookup+217/240]
[open_namei+512/1000]
[do_open+87/284]
[sys_open+57/112]
[system_call+85/128]
Code: 66 83 79 78 00 75 1c ba e7 03 00 00 8a 41 7c 0a 41 7d 75 03

--------- Dump of assembler code for function get_empty_inode ---------
0x120b5c <get_empty_inode>: pushl %edi
0x120b5d <get_empty_inode+1>: pushl %esi
0x120b5e <get_empty_inode+2>: pushl %ebx
0x120b5f <get_empty_inode+3>: movl 0x1a4330,%eax
0x120b64 <get_empty_inode+8>: cmpl %eax,0x1a4338
0x120b6a <get_empty_inode+14>: jle 0x120b7c <get_empty_inode+32>
0x120b6c <get_empty_inode+16>: sarl $0x1,%eax
0x120b6f <get_empty_inode+19>: cmpl %eax,0x1a4334
0x120b75 <get_empty_inode+25>: jnl 0x120b7c <get_empty_inode+32>
0x120b77 <get_empty_inode+27>: call 0x12040c <grow_inodes>
0x120b7c <get_empty_inode+32>: movl 0x1b3bc4,%ecx
0x120b82 <get_empty_inode+38>: xorl %ebx,%ebx
0x120b84 <get_empty_inode+40>: movl $0x3e8,%edi
0x120b89 <get_empty_inode+45>: movl 0x1a4330,%eax
0x120b8e <get_empty_inode+50>: shrl $0x1f,%eax
0x120b91 <get_empty_inode+53>: addl 0x1a4330,%eax
0x120b97 <get_empty_inode+59>: movl %eax,%esi
0x120b99 <get_empty_inode+61>: sarl $0x1,%esi
0x120b9c <get_empty_inode+64>: testl %esi,%esi
0x120b9e <get_empty_inode+66>: jle 0x120bcb <get_empty_inode+111>
0x120ba0 <get_empty_inode+68>: cmpw $0x0,0x78(%ecx)
0x120ba5 <get_empty_inode+73>: jne 0x120bc3 <get_empty_inode+103>
0x120ba7 <get_empty_inode+75>: movl $0x3e7,%edx
0x120bac <get_empty_inode+80>: movb 0x7c(%ecx),%al
0x120baf <get_empty_inode+83>: orb 0x7d(%ecx),%al
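
The oops reports EIP as symbol+offset/size with decimal numbers, so
the faulting instruction can be located by adding the offset to the
function's start address from the listing above. A small sketch of
that arithmetic follows; the helper name is made up for illustration,
and the base address must be supplied by the caller (here it is taken
from the gdb listing).

```c
#include <assert.h>
#include <stdio.h>

/* Turn an oops EIP string such as "[get_empty_inode+68/324]" into an
 * absolute address, given the symbol's start address.
 * Offset and size are decimal, as printed in the oops.
 * Returns 0 on success, -1 if the string cannot be parsed or the
 * offset lies outside the function. */
int oops_eip_to_addr(const char *eip, unsigned long sym_base,
                     unsigned long *addr_out)
{
    unsigned long off, size;

    assert(eip != NULL && addr_out != NULL);
    if (*eip == '[')
        eip++;                      /* skip the opening bracket */
    if (sscanf(eip, "%*[^+]+%lu/%lu", &off, &size) != 2)
        return -1;
    if (off >= size)
        return -1;
    *addr_out = sym_base + off;
    return 0;
}
```

With the base 0x120b5c, "get_empty_inode+68/324" resolves to
0x120ba0, which is the cmpw $0x0,0x78(%ecx) instruction in the
listing. The Code: line of the oops starts with the bytes
66 83 79 78 00, the same five-byte instruction, and ecx holds
890e89e4 at the time of the fault.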

--------- source from inode.c ---------
489     inode = first_inode;
490     best = NULL;
491     badness = 1000;
492     for (i = nr_inodes/2; i > 0; i--,inode = inode->i_next) {
493             if (!inode->i_count) {
494                     unsigned long i = 999;
495                     if (!(inode->i_lock | inode->i_dirt))
496                             i = inode->i_nrpages;
497                     if (i < badness) {
498                             best = inode;
499                             if (!i)
500                                     goto found_good;
501                             badness = i;
502                     }
503             }
504     }
