Re: 2.0.36 oops and sendmail

Allanah Myles (dossy@panoptic.com)
Mon, 18 Jan 1999 15:05:31 -0500


On 1999.01.03, Peter Green <pcg@gospelcom.net> wrote:
> [Please CC any responses if posting to the list. Thanks!]
>
> Found the following in my syslog:
[ ... oops omitted ... ]

> Those last three lines are repeated a NUMBER of times before the GPF
> (starting around 11 hours ago) and after the GPF. Additionally, the
> machine is spawning sendmail processes that don't respond to 'kill -9',
> nor do they ever seem to finish.
>
> Please let me know what other information I can supply in order to figure
> out what's going on. Thanks!

Well, my suggestion is to first cut-and-paste a complete kernel oops from
your syslog into a file and run it thorugh ksymoops. Look at where ksymoops
think you the panic was.

I'm seeing similar things in the 2.0.36 kernel, and I'm beginning to think
it's a race condition between the ext2fs driver and the tcp/ip driver.

I've seen this same kernel oops in prior linux kernels, but it was much
less frequent than with 2.0.36 - I haven't been able to troubleshoot this,
as I have very little kernel debugging experience (if someone would like
to teach me the basics such as where to start, what software to use,
I'll go debug this one myself).

Every oops refers to:

>>> Process suck (pid: 594, process nr: 70, stackpage=025f6000)

Now, as comical as it might be to see "process suck" causing a kernel
oops, for those who don't know what "suck" is, it's a small program,
that's being maintained by Robert A. Yetman, that allows you to
"suck" a newsfeed from an NNTP server. Indeed, if it's generating
kernel oopses, it does indeed "suck". :-)

It seems to oops in the exact same way each time (as in, the ksymoops
is identical from oops to oops). Only for this process. It's quite
reproducible, just at a random time. *snicker*

If anyone cares to help track this one down, please send me e-mail.

If anyone thinks a re-write of "suck" in Perl would be a good idea,
let me know - I've contemplated doing this for quite some time now...

-Dossy

For scrutiny, here's the oops and ksymoops output that I get, repeatedly
and reliably with 2.0.36:

------------------------------------------------------------
<1>Unable to handle kernel paging request at virtual address f958de65
current->tss.cr3 = 0278f000, %cr3 = 0278f000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<0010aefc>]
EFLAGS: 00010006
eax: 00000010 ebx: 0009002b ecx: 3958de65 edx: 00894810
esi: 00000000 edi: 025f7000 ebp: 025f6ea0 esp: 025f6e44
ds: 0018 es: 0018 fs: 0010 gs: 002b ss: 0018
Process suck (pid: 594, process nr: 70, stackpage=025f6000)
Stack: 001b002b 00000000 f958de65 025f6ea0 024d0018 05700000 05800000 05000000
001b0018 00111e3a 001b40d3 025f6ea0 00000000 00111b6c 00000202 00000018
00000000 0271d8ac 0290ce98 00000018 0010ab10 025f6ea0 00000000 027acee0
Call Trace: [<001b002b>] [<05700000>] [<05800000>] [<05000000>] [<001b0018>] [<00111e3a>] [<00111b6c>]
[<0010ab10>] [<0014df34>] [<0014df52>] [<00149f48>] [<0015572d>] [<00139caf>] [<0013abe9>] [<05050206>]
[<0010a985>]
Code: 64 8a 04 0e 0f a1 88 c2 81 e2 ff 00 00 00 89 54 24 10 52 68
------------------------------------------------------------

The ksymoops output:

------------------------------------------------------------

Using `/boot/System.map' to map addresses to symbols.

>>EIP: 10aefc <die_if_kernel+280/2c0>
Trace: 1b002b <sd_init+157/220>
Trace: 5700000
Trace: 5800000
Trace: 5000000
Trace: 1b0018 <sd_init+144/220>
Trace: 111e3a <do_page_fault+2ce/2e0>
Trace: 111e3a <do_page_fault+2ce/2e0>
Trace: 10ab10 <error_code+40/50>
Trace: 14df34 <tcp_dequeue_partial+28/34>
Trace: 14df52 <tcp_send_partial+12/2c>
Trace: 149f48 <tcp_sendmsg+bc/d8>
Trace: 15572d <inet_sendmsg+95/ac>
Trace: 139caf <sys_send+13f/174>
Trace: 13abe9 <sys_socketcall+1c1/2dc>
Trace: 5050206
Trace: 10a985 <system_call+55/80>

Code: 10aefc <die_if_kernel+280/2c0>
Code: 10aefc <die_if_kernel+280/2c0> 64 8a 04 0e movb %fs:(%esi,%ecx,1),%al
Code: 10af00 <die_if_kernel+284/2c0> 0f a1 popl %fs
Code: 10af02 <die_if_kernel+286/2c0> 88 c2 movb %al,%dl
Code: 10af04 <die_if_kernel+288/2c0> 81 e2 ff 00 00 andl $0xff,%edx
Code: 10af0f <die_if_kernel+293/2c0> 00
Code: 10af10 <die_if_kernel+294/2c0> 89 54 24 10 movl %edx,0x10(%esp,1)
Code: 10af14 <die_if_kernel+298/2c0> 52 pushl %edx
Code: 10af15 <die_if_kernel+299/2c0> 68 00 90 90 90 pushl $0x90909000
------------------------------------------------------------

-- 
URL: http://www.panoptic.com/~dossy -< BORK BORK! >- E-MAIL: dossy@panoptic.com
    Now I'm who I want to be, where I want to be, doing what I've always said I
    would and yet I feel I haven't won at all...      (Aug 9, 95: Goodbye, JG.)
"You should change your .sig; not that the world revolves around me." -s. sadie

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/