Re: 2.6.27.8-2 OOps e1000e

From: Christian Volkmann
Date: Mon Dec 22 2008 - 16:07:47 EST


Allan, Bruce W wrote:
> This is not an oops; it is a stack trace generated by a WARN() in e1000_acquire_swflag_ich8lan(), which was added to the kernel to assist in debugging resource contention issues. The particular issue your stack trace shows has been resolved by commit 8452759060ad46fc071a7d5bbf1647df5ea2ceab, which is already queued for 2.6.29.
>
Dear Bruce W,

Thank you for the quick answer.
I have applied the patch to the modules.

Best regards,

Christian


>> -----Original Message-----
>> From: Christian Volkmann [mailto:CVolkmann@xxxxxxxxxxxxxxxx]
>> Sent: Monday, December 22, 2008 5:07 AM
>> To: e1000-devel@xxxxxxxxxxxxxxxxxxxxx; Kirsher, Jeffrey T; Brandeburg,
>> Jesse; Allan, Bruce W; Waskiewicz Jr, Peter P; Ronciak, John
>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>> Subject: 2.6.27.8-2 OOps e1000e
>>
>> Hello,
>>
>> I got the following oops from an openSUSE 11.0 system with a 2.6.27.8-2
>> kernel.
>> The system should have been idle at the time of the oops.
>> I got the kernel from
>> ftp://ftp5.gwdg.de/pub/opensuse/repositories/home:/m4r3k:/kernel-backport/openSUSE_11.0/x86_64/kernel-default-2.6.27.8-2.1.x86_64.rpm
>> where it has since disappeared.
>>
>> Maybe somebody can find a reason and a fix for the oops. There are 12
>> similar systems connected via 100 Mbit (internal net, no regular use) and
>> 1000 Mbit (nfs4: each to each).
>> This is the second crash I got within the last two weeks, so it happens
>> very rarely.
>>
>> Best regards,
>> Christian
>>
>> PS: I do not read the mailing lists regularly. I would appreciate a
>> CC in case of additional questions for me.
>>
>> lspci -nn
>> 00:00.0 Host bridge [0600]: Intel Corporation Eaglelake DRAM Controller
>> [8086:2e10] (rev 03)
>> 00:02.0 VGA compatible controller [0300]: Intel Corporation Eaglelake
>> Integrated Graphics Controller [8086:2e12] (rev 03)
>> 00:02.1 Display controller [0380]: Intel Corporation Eaglelake Integrated
>> Graphics Controller [8086:2e13] (rev 03)
>> 00:03.0 Communication controller [0780]: Intel Corporation Eaglelake HECI
>> Controller [8086:2e14] (rev 03)
>> 00:03.2 IDE interface [0101]: Intel Corporation Eaglelake PT IDER
>> Controller [8086:2e16] (rev 03)
>> 00:03.3 Serial controller [0700]: Intel Corporation Eaglelake Serial KT
>> Controller [8086:2e17] (rev 03)
>> 00:19.0 Ethernet controller [0200]: Intel Corporation Device [8086:10de]
>> (rev 02)
>> 00:1a.0 USB Controller [0c03]: Intel Corporation ICH10 USB UHCI Controller
>> #4 [8086:3a67] (rev 02)
>> 00:1a.1 USB Controller [0c03]: Intel Corporation ICH10 USB UHCI Controller
>> #5 [8086:3a68] (rev 02)
>> 00:1a.2 USB Controller [0c03]: Intel Corporation ICH10 USB UHCI Controller
>> #6 [8086:3a69] (rev 02)
>> 00:1a.7 USB Controller [0c03]: Intel Corporation ICH10 USB2 EHCI
>> Controller #2 [8086:3a6c] (rev 02)
>> 00:1b.0 Audio device [0403]: Intel Corporation ICH10 HD Audio Controller
>> [8086:3a6e] (rev 02)
>> 00:1c.0 PCI bridge [0604]: Intel Corporation ICH10 PCI Express Port 1
>> [8086:3a70] (rev 02)
>> 00:1c.4 PCI bridge [0604]: Intel Corporation ICH10 PCI Express Port 5
>> [8086:3a78] (rev 02)
>> 00:1d.0 USB Controller [0c03]: Intel Corporation ICH10 USB UHCI Controller
>> #1 [8086:3a64] (rev 02)
>> 00:1d.1 USB Controller [0c03]: Intel Corporation ICH10 USB UHCI Controller
>> #2 [8086:3a65] (rev 02)
>> 00:1d.2 USB Controller [0c03]: Intel Corporation ICH10 USB UHCI Controller
>> #3 [8086:3a66] (rev 02)
>> 00:1d.7 USB Controller [0c03]: Intel Corporation ICH10 USB2 EHCI
>> Controller #1 [8086:3a6a] (rev 02)
>> 00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e]
>> (rev a2)
>> 00:1f.0 ISA bridge [0601]: Intel Corporation ICH10 LPC Interface
>> Controller [8086:3a14] (rev 02)
>> 00:1f.2 SATA controller [0106]: Intel Corporation ICH10 6 port SATA AHCI
>> Controller [8086:3a02] (rev 02)
>> 30:00.0 Ethernet controller [0200]: Intel Corporation 82572EI Gigabit
>> Ethernet Controller (Copper) [8086:10b9] (rev 06)
>>
>>
>> Dec 21 19:55:23 padmnrte08 smartd[3240]: Device: /dev/sda, SMART Usage
>> Attribute: 190 Airflow_Temperature_Cel changed from 65 to 66
>> Dec 21 19:55:23 padmnrte08 smartd[3240]: Device: /dev/sda, SMART Usage
>> Attribute: 194 Temperature_Celsius changed from 35 to 34
>> Dec 21 20:19:22 padmnrte08 kernel: ------------[ cut here ]------------
>> Dec 21 20:19:22 padmnrte08 kernel: WARNING: at
>> drivers/net/e1000e/ich8lan.c:424 e1000_acquire_swflag_ich8lan+0x37/0xb9
>> [e1000e]()
>> Dec 21 20:19:22 padmnrte08 kernel: e1000e mutex contention. Owned by pid
>> 2947
>> Dec 21 20:19:22 padmnrte08 kernel: Modules linked in: raw nfsd auth_rpcgss
>> exportfs binfmt_misc nfs lockd nfs_acl sunrpc iptable_filter ip_tables
>> ip6table_filter ip6_tables x_tables ipv6 cpufreq_conservative
>> cpufreq_userspace cpufreq_powersave acpi_cpufreq microcode fuse loop
>> dm_mod snd_hda_intel snd_pcm snd_timer snd_page_alloc snd_hwdep
>> ide_pci_generic rtc_cmos snd rtc_core video ide_core output serio_raw
>> pcspkr sr_mod button soundcore rtc_lib intel_agp ata_generic wmi cdrom sg
>> floppy sd_mod crc_t10dif ehci_hcd uhci_hcd usbcore e1000e edd ext3 mbcache
>> jbd fan pata_acpi ahci libata scsi_mod dock thermal processor thermal_sys
>> hwmon
>> Dec 21 20:19:22 padmnrte08 kernel: Supported: Yes
>> Dec 21 20:19:22 padmnrte08 kernel: Pid: 2947, comm: irqbalance Not tainted
>> 2.6.27.8-2-default #1
>> Dec 21 20:19:22 padmnrte08 kernel:
>> Dec 21 20:19:22 padmnrte08 kernel: Call Trace:
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffff8020e53e>]
>> show_trace_log_lvl+0x41/0x58
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffff804a94d9>]
>> dump_stack+0x69/0x6f
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffff80241640>]
>> warn_slowpath+0xb4/0xdc
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffffa00cbda3>]
>> e1000_acquire_swflag_ich8lan+0x37/0xb9 [e1000e]
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffffa00cc0c7>]
>> e1000_read_nvm_ich8lan+0x54/0xfd [e1000e]
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffffa00d3e1c>]
>> e1000_get_drvinfo+0x61/0xe9 [e1000e]
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffff8043a86d>]
>> ethtool_get_drvinfo+0x4a/0x11f
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffff8043b120>]
>> dev_ethtool+0x164/0x8cc
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffff80439fae>]
>> dev_ioctl+0x36e/0x4ab
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffff8042b7ca>]
>> sock_ioctl+0x1ec/0x1f6
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffff802c7809>]
>> vfs_ioctl+0x21/0x6c
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffff802c7a93>]
>> do_vfs_ioctl+0x23f/0x255
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffff802c7afa>]
>> sys_ioctl+0x51/0x73
>> Dec 21 20:19:22 padmnrte08 kernel: [<ffffffff8020c3fa>]
>> system_call_fastpath+0x16/0x1b
>> Dec 21 20:19:22 padmnrte08 kernel: [<00007fd570eb9b67>] 0x7fd570eb9b67
>> Dec 21 20:19:22 padmnrte08 kernel:
>> Dec 21 20:19:22 padmnrte08 kernel: ---[ end trace 4bb48ce3e2b90187 ]---
>> Dec 21 20:55:08 padmnrte08 syslog-ng[2567]: STATS: dropped 0
>> Dec 21 20:55:23 padmnrte08 smartd[3240]: Device: /dev/sda, SMART Usage
>> Attribute: 190 Airflow_Temperature_Cel changed from 66 to 65
>> Dec 21 20:55:23 padmnrte08 smartd[3240]: Device: /dev/sda, SMART Usage
>> Attribute: 194 Temperature_Celsius changed from 34 to 35
>> Dec 21 21:55:08 padmnrte08 syslog-ng[2567]: STATS: dropped 0
>> Dec 21 22:19:08 padmnrte08 kernel: nfs: server rte11 not responding, timed
>> out
>> Dec 21 22:20:09 padmnrte08 kernel: nfs4_reclaim_open_state: unhandled
>> error -116. Zeroing state
>> Dec 21 22:55:08 padmnrte08 syslog-ng[2567]: STATS: dropped 0
>> Dec 21 23:55:08 padmnrte08 syslog-ng[2567]: STATS: dropped 0
>>