Re: smbfs and mainstream 2.1.14

J. Sean Connell (ankh@canuck.gen.nz)
Thu, 12 Dec 1996 20:31:37 +1300 (NZDT)


On Wed, 11 Dec 1996, Regis DUCHESNE wrote:

> Today, I tried to mount a share which was shared for all (without
> password) at a user-level by a Win95 machine whose Netbios name is
> ARBITRE and whose DNS name is arrouart.via.ecp.fr (ip = 138.195.131.157,
> the machine is in the DNS)
>
> [rest of description of what happened snipped]

This happened to me the other day too, except mine got worse. I tried
multiple times to mount the machine in question, fiddling with different
options, and having no success. Then, while watching a tcpdump, I got a bunch
of (non-fatal) oopses of the "unable to handle kernel paging request at
virtual address <address>" kind; my system kept on goin', and I had a
seven-day uptime, so I ignored them.

Until a few hours later I realized that when I logged off any console, a new
getty wasn't coming up.

Then I realized that init was marked D, and my load average was round about 3,
and there were about 500 zombie processes.

So of course, I tried to do a sync and a reboot; I guessed (correctly) that
shutdown would be useless. The sync got stuck, leaving that console useless.

Switching to another root console, I tried sync&, which created another stuck
sync (in the background this time :). Then I tried shutdown -r now, which of
course got stuck.

Now I'm getting worried. I do a ps -alx, and find out that all these stuck
processes are stuck in either __wait_on_super or sync_supers, probably the
latter (this was a few days ago and I didn't write it down).
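For anyone who would rather not eyeball ps output, here is a rough Python
sketch that picks out processes in uninterruptible sleep (state D) from /proc.
It assumes a Linux /proc where each /proc/<pid>/stat line has the form
"pid (comm) state ..."; the function names are illustrative and not part of
any existing tool.

```python
import os

def parse_stat(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the LAST ')' rather than on whitespace.
    """
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    return int(pid_text), comm, tail.split()[0]

def stuck_processes(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    if not os.path.isdir(proc_root):
        return []  # no procfs here (assumption: Linux-style /proc)
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except OSError:
            continue  # process exited while we were scanning
        if state == "D":
            stuck.append((pid, comm))
    return stuck

if __name__ == "__main__":
    for pid, comm in stuck_processes():
        print(pid, comm)
```

Reading /proc directly does not depend on the stuck processes responding,
which is the situation described here.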

"Eureka," think I. "It's because of the botched mount! Now how do I *fix*
it?!"

I eventually ended up shutting down most services manually, killing most
running processes, and unmounting everything except /, /usr, and /home (if I
could've I would've, but I had a stuck sync that was using /home/root for its
homedir...).

From /var/log/syslog:

Dec 11 09:17:49 balance kernel: smb_dont_catch_keepalive: server->data_ready == NULL
Dec 11 09:18:24 balance kernel: smb_dont_catch_keepalive: server->data_ready == NULL

27 seconds later, at Dec 11 09:18:51 (after much fiddling around on my
part), *kaboom*:

Unable to handle kernel paging request at virtual address c204ea3c
current->tss.cr3 = 01414000, %cr3 = 01414000
*pde = 00001063
*pte = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c204ea3c>]
EFLAGS: 00010293
eax: ffffffa1 ebx: c13ce810 ecx: c0072bd4 edx: 0000005f
esi: 00000fe6 edi: c13ce810 ebp: c01e8da4 esp: c0072bfc
ds: 0018 es: 0018 ss: 0018
Process smbmount (pid: 30983, process nr: 23, stackpage=c0072000)
Stack: c140fa38 c13ce93c 00000010 00000000 c0072cf4 c204d573 c13ce810 c13ce810
c0072ef0 c13ceab8 c01e8da4 c0f59158 00003000 c0cccba0 c1231000 c01e7074
c011bafe c011bb8b c0c79010 015fd067 c011b9e0 080042dc c1369b98 c0e00414
Call Trace: [<c204d573>] [<c011bafe>] [<c011bb8b>] [<c011b9e0>] [<c010ab30>] [<c0120002>] [<c0110c9c>]
[<c010ab30>] [<c0132f24>] [<c013413d>] [<c2050e90>] [<c2050ea7>] [<c2050ebf>] [<c2050ed6>] [<c2050ee0>]
[<c2050eea>] [<c2050ef5>] [<c204e0b6>] [<c204f385>] [<c2051714>] [<c012ab69>] [<c012b013>] [<c20513d1>]
[<c012b5da>] [<c20513d1>] [<c20513d1>] [<c010aa08>]

From ksymoops:
-----
Trace: c204d573
Trace: c011bafe <do_no_page+11e/370>
Trace: c011bb8b <do_no_page+1ab/370>
Trace: c011bb8b <do_no_page+1ab/370>
Trace: c010ab30 <error_code+30/40>
Trace: c0120002 <sys_mremap+1c2/310>
Trace: c0110c9c <do_page_fault+11c/320>
Trace: c010ab30 <error_code+30/40>
Trace: c0132f24 <padzero+54/70>
Trace: c013413d <load_elf_binary+b1d/bc0>
Trace: c2050e90
Trace: c2050ea7
Trace: c2050ebf
Trace: c2050ed6
Trace: c2050ee0
Trace: c2050eea
Trace: c2050ef5
Trace: c204e0b6
Trace: c204f385
Trace: c2051714
Trace: c012ab69 <read_super+c9/100>
Trace: c012b013 <do_mount+e3/150>
Trace: c20513d1
Trace: c012b5da <sys_mount+30a/350>
Trace: c20513d1
Trace: c20513d1
Trace: c010aa08 <system_call+38/40>
-----
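The symbol+offset/size annotations above come from matching each address
against a sorted symbol table. A minimal Python sketch of that kind of lookup
follows; it is not ksymoops itself. The start of do_no_page (c011b9e0) and its
size (0x370) are derived from the "do_no_page+11e/370" line above; the name
next_symbol is a hypothetical placeholder, present only so the size can be
computed.

```python
import bisect

# (start address, name), sorted by address. do_no_page's start is
# 0xc011bafe - 0x11e, taken from the trace; "next_symbol" is a
# hypothetical placeholder marking where do_no_page ends (+0x370).
SYMBOLS = [
    (0xC011B9E0, "do_no_page"),
    (0xC011BD50, "next_symbol"),
]

def resolve(addr, symbols=SYMBOLS):
    """Return 'name+offset/size' for addr, or None if it is not covered."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    # No symbol at or below addr, or addr is at/after the last known
    # start (so the size of the enclosing function is unknown).
    if i < 0 or i + 1 >= len(symbols):
        return None
    start, name = symbols[i]
    size = symbols[i + 1][0] - start
    return "%s+%x/%x" % (name, addr - start, size)

print(resolve(0xC011BAFE))  # do_no_page+11e/370
print(resolve(0xC011BB8B))  # do_no_page+1ab/370
print(resolve(0xC204D573))  # None: not covered by this table
```

Addresses that fall outside the table come back unresolved, which matches how
the bare c204xxxx and c205xxxx entries appear in the output above.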

I would show you the offending piece of code, but it panicked again too
quickly:

Code: <1>Unable to handle kernel paging request at virtual address c204ea3c
current->tss.cr3 = 01414000, %cr3 = 01414000
*pde = 00001063
*pte = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c010aefe>]
EFLAGS: 00010286
eax: 00000010 ebx: 0000002b ecx: c204ea3c edx: 00000010
esi: 00000000 edi: c0073000 ebp: c0072bc8 esp: c0072b70
ds: 0018 es: 0018 ss: 0018
Process smbmount (pid: 30983, process nr: 23, stackpage=c0072000)
Stack: c01b002b 00000000 0004e000 c0072bc8 c0e00414 c2000000 c2800000 c2000000
c0e00018 c0110e86 c01b18c9 c0072bc8 00000000 c0e00414 00000fe6 c13ce810
c01e8da4 c0a3d414 c1369b98 c010ab30 c0072bc8 00000000 c13ce810 c0072bd4
Call Trace: [<c01b002b>] [<c2000000>] [<c2800000>] [<c2000000>] [<c0110e86>] [<c010ab30>] [<c204ea3c>]
[<c204d573>] [<c011bafe>] [<c011bb8b>] [<c011b9e0>] [<c010ab30>] [<c0120002>] [<c0110c9c>] [<c010ab30>]
[<c0132f24>] [<c013413d>] [<c2050e90>] [<c2050ea7>] [<c2050ebf>] [<c2050ed6>] [<c2050ee0>] [<c2050eea>]
[<c2050ef5>] [<c204e0b6>] [<c204f385>] [<c2051714>] [<c012ab69>] [<c012b013>] [<c20513d1>] [<c012b5da>]
[<c20513d1>] [<c20513d1>] [<c010aa08>]
Code: 64 8a 04 0e 0f a1 88 c2 81 e2 ff 00 00 00 89 54 24 10 52 68

From ksymoops:
-----
>>EIP: c010aefe <die_if_kernel+26e/2c0>
Trace: c01b002b <sprintf+109b/18c5>
Trace: c2000000
Trace: c2800000
Trace: c2000000
Trace: c0110e86 <do_page_fault+306/320>
Trace: c010ab30 <error_code+30/40>
Trace: c204ea3c
Trace: c204d573
Trace: c011bafe <do_no_page+11e/370>
Trace: c011bb8b <do_no_page+1ab/370>
Trace: c011bb8b <do_no_page+1ab/370>
Trace: c010ab30 <error_code+30/40>
Trace: c0120002 <sys_mremap+1c2/310>
Trace: c0110c9c <do_page_fault+11c/320>
Trace: c010ab30 <error_code+30/40>
Trace: c0132f24 <padzero+54/70>
Trace: c013413d <load_elf_binary+b1d/bc0>
Trace: c2050e90
Trace: c2050ea7
Trace: c2050ebf
Trace: c2050ed6
Trace: c2050ee0
Trace: c2050eea
Trace: c2050ef5
Trace: c204e0b6
Trace: c204f385
Trace: c2051714
Trace: c012ab69 <read_super+c9/100>
Trace: c012b013 <do_mount+e3/150>
Trace: c20513d1
Trace: c012b5da <sys_mount+30a/350>
Trace: c20513d1
Trace: c20513d1
Trace: c010aa08 <system_call+38/40>

Code: c010aefe <die_if_kernel+26e/2c0> movb %fs:(%esi,%ecx,1),%al
Code: c010af02 <die_if_kernel+272/2c0> popl %fs
Code: c010af04 <die_if_kernel+274/2c0> movb %al,%dl
Code: c010af06 <die_if_kernel+276/2c0> andl $0xff,%edx
Code: c010af0c <die_if_kernel+27c/2c0> movl %edx,0x10(%esp,1)
Code: c010af10 <die_if_kernel+280/2c0> pushl %edx
Code: c010af11 <die_if_kernel+281/2c0> pushl $0x90909000
-----
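Comparing the two Call Trace lists shows that the second oops happened on top
of the first: the second trace ends with the whole of the first one. A small
Python sketch of that check, using the two raw traces from above as input (the
function names are illustrative):

```python
import re

FIRST = """
Call Trace: [<c204d573>] [<c011bafe>] [<c011bb8b>] [<c011b9e0>] [<c010ab30>] [<c0120002>] [<c0110c9c>]
[<c010ab30>] [<c0132f24>] [<c013413d>] [<c2050e90>] [<c2050ea7>] [<c2050ebf>] [<c2050ed6>] [<c2050ee0>]
[<c2050eea>] [<c2050ef5>] [<c204e0b6>] [<c204f385>] [<c2051714>] [<c012ab69>] [<c012b013>] [<c20513d1>]
[<c012b5da>] [<c20513d1>] [<c20513d1>] [<c010aa08>]
"""

SECOND = """
Call Trace: [<c01b002b>] [<c2000000>] [<c2800000>] [<c2000000>] [<c0110e86>] [<c010ab30>] [<c204ea3c>]
[<c204d573>] [<c011bafe>] [<c011bb8b>] [<c011b9e0>] [<c010ab30>] [<c0120002>] [<c0110c9c>] [<c010ab30>]
[<c0132f24>] [<c013413d>] [<c2050e90>] [<c2050ea7>] [<c2050ebf>] [<c2050ed6>] [<c2050ee0>] [<c2050eea>]
[<c2050ef5>] [<c204e0b6>] [<c204f385>] [<c2051714>] [<c012ab69>] [<c012b013>] [<c20513d1>] [<c012b5da>]
[<c20513d1>] [<c20513d1>] [<c010aa08>]
"""

def trace_addresses(text):
    """Return the bracketed [<hex>] addresses in text, in order."""
    return [int(m, 16) for m in re.findall(r"\[<([0-9a-f]{8})>\]", text)]

def extra_frames(first, second):
    """If second ends with all of first, return the entries added on top.

    Returns None when the first trace is not a suffix of the second.
    """
    a, b = trace_addresses(first), trace_addresses(second)
    if len(b) >= len(a) and b[len(b) - len(a):] == a:
        return b[:len(b) - len(a)]
    return None

# Entries that appear only in the second trace, outermost last.
print([format(x, "08x") for x in extra_frames(FIRST, SECOND)])
```

The last of the extra entries is c204ea3c, the EIP of the first oops and the
value in ecx during the second. Together with the movb at die_if_kernel+26e,
that suggests the second fault came from die_if_kernel trying to read the code
bytes at that address in order to print them.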

Hope this-all helps someone out there :)

--
Jeffrey Connell            | Systems Administrator, ICONZ
ankh@canuck.gen.nz         | Any opinions stated above are not my employers',
ankh@iconz.co.nz           | not my boyfriend's, my priest's, my God's, my
#include <stddisc.h>       | friends', and probably not even my own.
---------------------------+--------------------------------------------------
Fingerprint: 1024/2B8B116D | Key at http://www.canuck.gen.nz/~ankh/pgpkey.html