I posted a message to comp.os.linux.misc a few days back about this,
hoping for some insight, but got no responses. I then ran across some
advice (to someone else) that oopses should be sent here, so here I am.
The Problem:
For background, I am running kernel 2.2.6-ac1 and lm_sensors 2.3.0.
I think I may have uncovered some kind of race condition. I was
unloading the lm_sensors modules with 'rmmod', apparently just at the same
time that a 'cat' process was trying to cat the values in
/proc/sys/dev/sensors/lm79-i2c-0-2d/temp . Five things have resulted:
1) I got a kernel oops. :-)
2) the cat process is hung (forunately not eating up resources). I cannot
kill it, even with signal 9 as root. A 'ps' shows it is hung on
process c01baca8, which does not appear in System.map.
3) After reinserting the modules, I have 2 files named 'temp' in the
directory above:
13:58:14 (/root) root% ls /proc/sys/dev/sensors/lm79-i2c-0-2d
total 0
dr-xr-xr-x 2 root root 0 Apr 24 19:55 ./
dr-xr-xr-x 3 root root 0 Apr 24 19:55 ../
-r--r--r-- 1 root root 0 Apr 28 13:58 alarms
-rw-r--r-- 1 root root 0 Apr 28 13:58 fan1
-rw-r--r-- 1 root root 0 Apr 28 13:58 fan2
-rw-r--r-- 1 root root 0 Apr 28 13:58 fan3
-rw-r--r-- 1 root root 0 Apr 28 13:58 fan_div
-rw-r--r-- 1 root root 0 Apr 28 13:58 in0
-rw-r--r-- 1 root root 0 Apr 28 13:58 in1
-rw-r--r-- 1 root root 0 Apr 28 13:58 in2
-rw-r--r-- 1 root root 0 Apr 28 13:58 in3
-rw-r--r-- 1 root root 0 Apr 28 13:58 in4
-rw-r--r-- 1 root root 0 Apr 28 13:58 in5
-rw-r--r-- 1 root root 0 Apr 28 13:58 in6
-rw-r--r-- 1 root root 0 Apr 24 19:55 temp
-rw-r--r-- 1 root root 0 Apr 24 19:55 temp
-r--r--r-- 1 root root 0 Apr 28 13:58 vid
13:58:25 (/root) root%
4) I cannot 'rmmod' the lm78 module anymore (my guess is that this is the
module that the cat process is hung in):
14:01:14 (/root) root% lsmod
Module Size Used by
i2c-piix4 2732 0 (unused)
i2c-proc 2376 0
lm78 6540 1
sensors 5056 1 [lm78]
smbus 1588 0 [i2c-piix4 lm78 sensors]
i2c-core 3604 0 [i2c-piix4 i2c-proc lm78 sensors smbus]
14:01:16 (/root) root% rmmod lm78
rmmod: lm78: Device or resource busy
14:01:17 (/root) root%
5) Now, whenever I 'cat' the temp file in the directory above, one of
three things can happen, and it seems to dependi on who does executes
it:
14:03:18 (/home/jason) jason% cat /proc/sys/dev/sensors/lm79-i2c-0-2d/temp
cat: /proc/sys/dev/sensors/lm79-i2c-0-2d/temp: Operation not permitted
14:03:19 (/home/jason) jason%
14:03:05 (/root) root% cat /proc/sys/dev/sensors/lm79-i2c-0-2d/temp
Segmentation fault
14:03:06 (/root) root% cat /proc/sys/dev/sensors/lm79-i2c-0-2d/temp
cat: /proc/sys/dev/sensors/lm79-i2c-0-2d/temp: Not a directory
14:06:04 (/root) root%
*** In the case where I get "Segmentation fault", I get a new oops! ***
*** They are all very similar, so I will include a sample one below. ***
===========================================================================
This has not caused any system instability whatsoever, so I have not
rebooted since it happened. This way, I can hopefully provide more
information to whoever needs it. If any more information is needed,
please feel free to contact me. Chances are I will not reboot for a
while.
Thanks,
Jason
=======================
The output of ksymoops:
=======================
Options used: -V (default)
-o /lib/modules/2.2.6-ac1/ (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-m /usr/src/linux/System.map (default)
-c 1 (default)
-a - same as ksymoops (default)
You did not tell me where to find symbol information. I will assume
that the log matches the kernel and modules that are running right now
and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.
Warning in compare_ksyms_lsmod, module i2c-piix4 is in lsmod but not in ksyms, probably no symbols exported
Warning in compare_ksyms_lsmod, module i2c-proc is in lsmod but not in ksyms, probably no symbols exported
Warning in compare_ksyms_lsmod, module lm78 is in lsmod but not in ksyms, probably no symbols exported
Warning: cannot match loaded module sensors to a unique module object. Trace may not be reliable.
Warning: cannot match loaded module smbus to a unique module object. Trace may not be reliable.
Warning: cannot match loaded module i2c-core to a unique module object. Trace may not be reliable.
kernel: Unable to handle kernel paging request at virtual address 083ccff0
kernel: Code: <1>Unable to handle kernel paging request at virtual address 083ccff0
7 warnings issued. Results may not be reliable.
==============================================================================
===============
An actual oops:
===============
kernel: Unable to handle kernel paging request at virtual address 083ccff0
kernel: current->tss.cr3 = 02533000, %cr3 = 02533000
kernel: *pde = 02114067
kernel: *pte = 00000000
kernel: Oops: 0000
kernel: CPU: 0
kernel: EIP: 0010:[<083ccff0>]
kernel: EFLAGS: 00010246
kernel: eax: 083ccff0 ebx: c127fac0 ecx: 0804aef8 edx: 0804aef8
kernel: esi: c302e7e0 edi: 00000000 ebp: 00001000 esp: c2ee1f50
kernel: ds: 0018 es: 0018 ss: 0018
kernel: Process cat (pid: 24905, process nr: 112, stackpage=c2ee1000)
kernel: Stack: 00000000 c302e7e0 0804aef8 c2ee1f70 c302e7e0 0804aef8 0804aef8 00001000
kernel: 00001000 c01177dc 00000000 c302e7e0 0804aef8 00001000 c302e7f4 c302e7e0
kernel: c0123c46 c302e7e0 0804aef8 00001000 c302e7f4 c2ee0000 00001000 0804aef8
kernel: Call Trace: [proc_readsys+28/36] [sys_read+190/220] [system_call+52/56]
kernel: Code: <1>Unable to handle kernel paging request at virtual address 083ccff0
kernel: current->tss.cr3 = 02533000, %cr3 = 02533000
kernel: *pde = 02114067
kernel: *pte = 00000000
kernel: Oops: 0000
kernel: CPU: 0
kernel: EIP: 0010:[show_registers+619/668]
kernel: EFLAGS: 00010082
kernel: eax: 083ccff0 ebx: 00000000 ecx: c01facd8 edx: c2568000
kernel: esi: 0000002b edi: c2ee2000 ebp: c4800000 esp: c2ee1e94
kernel: ds: 0018 es: 0018 ss: 0018
kernel: Process cat (pid: 24905, process nr: 112, stackpage=c2ee1000)
kernel: Stack: c2ee0000 c3fde160 c01face2 c302e7e0 00000000 00001000 083ccff0 c127fac0
kernel: 0804aef8 0804aef8 083ccff0 00010246 c4800000 c5000000 c0108f6c c2ee1f14
kernel: c01bdb0c c01beaf4 00000000 00000000 c010df16 c01beaf4 c2ee1f14 00000000
kernel: Call Trace: [<c4800000>] [<c5000000>] [die+48/56] [stext_lock+4348/8540] [stext_lock+8420/8540] [do_page_fault+710/792] [stext_lock+8420/8540]
kernel: [error_code+45/52] [__down_interruptible+156/184] [do_rw_proc+130/160] [proc_readsys+28/36] [sys_read+190/220] [system_call+52/56]
kernel: Code: 0f b6 0c 03 89 4c 24 38 51 68 04 db 1b c0 e8 3a 98 00 00 83
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/