Re: CONFIG_PREEMPT -> crash under load in 2.6.20?

From: Nix
Date: Sun Mar 04 2007 - 18:03:55 EST


On 4 Mar 2007, Corey Hickey told this:
> Nix wrote:
>> I can't tell if magic sysrq dies, because as far as I know there's no
>> way to get magic sysrq to do much visible when you're in X, and I can't
>> get anything to go over the network kernel syslog because the network is
>> dead.
>
> You should still be able to use SysRQ even in X. I tested right now.
> 1. Have X running already and then start X in another VT
> $ X :2 vt10
> 2. Hit Alt+SysRQ+K
> ---> X dies, display gets corrupted, and keyboard input ignored
> 3. ssh in from another machine and switch back to the running X instance
> # chvt 7

The network's dead; that's impossible.

However, I'll try SAK followed by a reg-and-flags dump: I'd entirely
forgotten about SAK. If that doesn't work I'll try to get a kexec dump.
(This is all moot if, as seems likely, the system's too dead to respond
to the keyboard: we shall see.)

>> I could begin a (really laborious, ~1 day per iteration) bisection to
>> try to track this down, but before I start, has anyone seen this before?
>> Is its cause known?
>
> I've been seeing the same behavior under different circumstances; I
> don't know if it's related.
>
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0703.0/1147.html

That's using _VOLUNTARY, so I sort of doubt that it's the *same*
problem, but I suppose it might be related. (But this is hypothesis in
the absence of data :) )

> For me, though, CONFIG_PREEMPT doesn't seem to have an effect. One thing
> I had noticed, however, is that with the problem I'm experiencing,
> changing some config options inexplicably delayed the onset of the
> lockup, but the lockup occurred nonetheless. Have you been running with
> CONFIG_PREEMPT_VOLUNTARY for a while now and not seen any problems at all?

22:58:47 up 10 days, 22:20, 37 users, load average: 12.71, 11.14, 18.22

No problems, and I've been loading the system really rather hard today
(as that line makes clear). I think the problem I'm seeing really *is*
tied to _PREEMPT.

> It might be helpful if you reported your hardware information; I'd be
> interested in seeing if there's much in common with my own machine.

Athlon 4 (UP), 768Mb RAM. No ACPI (to rule out a large nasty spot as
soon as possible). Random info starting with loaded modules:

loop 11080 0 - Live 0xf0a56000
radeon 107808 2 - Live 0xf0a5c000
drm 62356 3 radeon, Live 0xf0a2f000

processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 6
model name : AMD Athlon(tm) Processor
stepping : 2
cpu MHz : 1250.178
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow ts
bogomips : 2501.59
clflush size : 32

CPU0
0: 944756984 IO-APIC-edge timer
1: 1397516 IO-APIC-edge i8042
2: 0 XT-PIC-XT cascade
6: 5 IO-APIC-edge floppy
7: 0 IO-APIC-edge parport0
8: 1 IO-APIC-edge rtc
12: 2877936 IO-APIC-edge i8042
14: 20262275 IO-APIC-edge ide2
16: 330856361 IO-APIC-fasteoi EMU10K1, radeon@pci:0000:01:00.0
18: 115648745 IO-APIC-fasteoi gordianet
19: 11059053 IO-APIC-fasteoi ide0, ide1
21: 0 IO-APIC-fasteoi uhci_hcd:usb1, uhci_hcd:usb2
NMI: 0
LOC: 944792893
ERR: 0
MIS: 49


0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-006f : keyboard
0070-0077 : rtc
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : 0000:00:11.1
01f0-01f7 : 0000:00:11.1
01f0-01f7 : ide2
0295-0296 : w83627hf
0376-0376 : 0000:00:11.1
0378-037a : parport0
037b-037f : parport0
03c0-03df : vga+
03f2-03f5 : floppy
03f6-03f6 : 0000:00:11.1
03f6-03f6 : ide2
03f7-03f7 : floppy DIR
03f8-03ff : serial
0400-0407 : vt596_smbus
0cf8-0cff : PCI conf1
b000-bfff : PCI Bus #01
b800-b8ff : 0000:01:00.0
c800-c81f : 0000:00:11.2
c800-c81f : uhci_hcd
cc00-cc1f : 0000:00:11.3
cc00-cc1f : uhci_hcd
d000-d07f : 0000:00:07.0
d400-d41f : 0000:00:05.0
d400-d41f : EMU10K1
d800-d80f : 0000:00:0c.0
d800-d807 : ide0
d808-d80f : ide1
dc00-dc03 : 0000:00:0c.0
dc02-dc02 : ide1
e000-e007 : 0000:00:0c.0
e000-e007 : ide1
e400-e403 : 0000:00:0c.0
e402-e402 : ide0
e800-e807 : 0000:00:0c.0
e800-e807 : ide0
ec00-ec07 : 0000:00:05.1
ec00-ec07 : emu10k1-gp
fc00-fc0f : 0000:00:11.1
fc00-fc07 : ide2
fc08-fc0f : ide3


Here's some lspci output:

00:00.0 Host bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333]
Subsystem: VIA Technologies, Inc. Unknown device 0000
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
Latency: 0
Region 0: Memory at e0000000 (32-bit, prefetchable) [size=128M]
Capabilities: [a0] AGP version 2.0
Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW+ AGP3- Rate=x1,x2,x4
Command: RQ=1 ArqSz=0 Cal=0 SBA+ AGP+ GART64- 64bit- FW- Rate=x4
Capabilities: [c0] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:01.0 PCI bridge: VIA Technologies, Inc. VT8366/A/7 [Apollo KT266/A/333 AGP] (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
Latency: 0
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 0000b000-0000bfff
Memory behind bridge: dfe00000-dfefffff
Prefetchable memory behind bridge: bfc00000-dfcfffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA+ VGA+ MAbort- >Reset- FastB2B-
Capabilities: [80] Power Management version 2
Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:05.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 04)
Subsystem: Creative Labs CT4620 SBLive!
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (500ns min, 5000ns max)
Interrupt: pin A routed to IRQ 16
Region 0: I/O ports at d400 [size=32]
Capabilities: [dc] Power Management version 1
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:05.1 Input device controller: Creative Labs SB Live! Game Port (rev 01)
Subsystem: Creative Labs Gameport Joystick
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32
Region 0: I/O ports at ec00 [size=8]
Capabilities: [dc] Power Management version 1
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
Subsystem: 3Com Corporation 3C905B Fast Etherlink XL 10/100
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (2500ns min, 2500ns max), Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 18
Region 0: I/O ports at d000 [size=128]
Region 1: Memory at dfff7f80 (32-bit, non-prefetchable) [size=128]
Expansion ROM at dffc0000 [disabled] [size=128K]
Capabilities: [dc] Power Management version 1
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0b.0 USB Controller: NEC Corporation USB (rev 41) (prog-if 10 [OHCI])
Subsystem: NEC Corporation Hama USB 2.0 CardBus
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (250ns min, 10500ns max), Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 18
Region 0: Memory at dfff5000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0b.1 USB Controller: NEC Corporation USB (rev 41) (prog-if 10 [OHCI])
Subsystem: NEC Corporation Hama USB 2.0 CardBus
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (250ns min, 10500ns max), Cache Line Size: 32 bytes
Interrupt: pin B routed to IRQ 19
Region 0: Memory at dfff6000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0b.2 USB Controller: NEC Corporation USB 2.0 (rev 02) (prog-if 20 [EHCI])
Subsystem: Micro-Star International Co., Ltd. Unknown device 3504
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (4000ns min, 8500ns max), Cache Line Size: 32 bytes
Interrupt: pin C routed to IRQ 16
Region 0: Memory at dfff7e00 (32-bit, non-prefetchable) [size=256]
Capabilities: [40] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0c.0 RAID bus controller: Promise Technology, Inc. PDC20276 (MBFastTrak133 Lite) (rev 01) (prog-if 85)
Subsystem: Promise Technology, Inc. MBFastTrak133 Lite (tm) Controller (RAID mode)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=slow >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (1000ns min, 4500ns max), Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 19
Region 0: I/O ports at e800 [size=8]
Region 1: I/O ports at e400 [size=4]
Region 2: I/O ports at e000 [size=8]
Region 3: I/O ports at dc00 [size=4]
Region 4: I/O ports at d800 [size=16]
Region 5: Memory at dfffc000 (32-bit, non-prefetchable) [size=16K]
Capabilities: [60] Power Management version 1
Flags: PMEClk- DSI+ D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:11.0 ISA bridge: VIA Technologies, Inc. VT8233A ISA Bridge
Subsystem: VIA Technologies, Inc. Unknown device 0000
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0
Capabilities: [c0] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06) (prog-if 8a [Master SecP PriP])
Subsystem: VIA Technologies, Inc. VT82C586/B/VT82C686/A/B/VT8233/A/C/VT8235 PIPC Bus Master IDE
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32
Interrupt: pin A routed to IRQ 16
Region 0: [virtual] Memory at 000001f0 (32-bit, non-prefetchable) [size=8]
Region 1: [virtual] Memory at 000003f0 (type 3, non-prefetchable) [size=1]
Region 2: [virtual] Memory at 00000170 (32-bit, non-prefetchable) [size=8]
Region 3: [virtual] Memory at 00000370 (type 3, non-prefetchable) [size=1]
Region 4: I/O ports at fc00 [size=16]
Capabilities: [c0] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:11.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 23) (prog-if 00 [UHCI])
Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32, Cache Line Size: 32 bytes
Interrupt: pin D routed to IRQ 21
Region 4: I/O ports at c800 [size=32]
Capabilities: [80] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:11.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 23) (prog-if 00 [UHCI])
Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32, Cache Line Size: 32 bytes
Interrupt: pin D routed to IRQ 21
Region 4: I/O ports at cc00 [size=32]
Capabilities: [80] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

01:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 PRO] (rev 01) (prog-if 00 [VGA])
Subsystem: ATI Technologies Inc RV280 [Radeon 9200 PRO]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (2000ns min), Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 16
Region 0: Memory at d0000000 (32-bit, prefetchable) [size=128M]
Region 1: I/O ports at b800 [size=256]
Region 2: Memory at dfef0000 (32-bit, non-prefetchable) [size=64K]
Expansion ROM at dfec0000 [disabled] [size=128K]
Capabilities: [58] AGP version 2.0
Status: RQ=80 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW+ AGP3- Rate=x1,x2,x4
Command: RQ=32 ArqSz=0 Cal=0 SBA+ AGP+ GART64- 64bit- FW- Rate=x4
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

01:00.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200 PRO] (Secondary) (rev 01)
Subsystem: ATI Technologies Inc Unknown device 5961
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (2000ns min), Cache Line Size: 32 bytes
Region 0: Memory at c8000000 (32-bit, prefetchable) [size=128M]
Region 1: Memory at dfee0000 (32-bit, non-prefetchable) [size=64K]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-



--
`In the future, company names will be a 32-character hex string.'
--- Bruce Schneier on the shortage of company names
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/