Re: kernel BUG at kernel/workqueue.c:291

From: Carsten Aulbert
Date: Mon Mar 02 2009 - 05:52:18 EST


Hi again,

In the meantime, 43 of our nodes have been hit by this error. It seems
that the jobs of a certain user can trigger this bug; however, I have no
clue how to trigger it manually.

My questions:
Is this a known bug in 2.6.27.14? We can upgrade to .19 if necessary,
but as this file (kernel/workqueue.c) was not modified recently, I
suspect there is no ready fix for it.

Do you need any more info about our systems (Intel X3220 based Supermicro
systems), the kernel config (deadline scheduler in use, ...) or something
else?

Carsten Aulbert wrote:
> [228704.928037] ------------[ cut here ]------------
> [228704.928224] kernel BUG at kernel/workqueue.c:291!
> [228704.928404] invalid opcode: 0000 [1] SMP
> [228704.928647] CPU 0
> [228704.928852] Modules linked in: lm92 w83793 w83781d hwmon_vid hwmon nfs nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs autofs4 netconsole configfs ipmi_si ipmi_devintf ipmi_watchdog ipmi_poweroff ipmi_msghandler e1000e i2c_i801 8250_pnp 8250 serial_core i2c_core
> [228704.930002] Pid: 1609, comm: rpciod/0 Not tainted 2.6.27.14-nodes #1
> [228704.930002] RIP: 0010:[<ffffffff8023c6db>] [<ffffffff8023c6db>] run_workqueue+0x6f/0x102
> [228704.930002] RSP: 0018:ffff880214bcdec0 EFLAGS: 00010207
> [228704.930002] RAX: 0000000000000000 RBX: ffff880214b82f40 RCX: ffff880215444418
> [228704.930002] RDX: ffff880187d07d58 RSI: ffff880214bcdee0 RDI: ffff880215444410
> [228704.930002] RBP: ffffffffa0077186 R08: ffff880214bcc000 R09: ffff88021491f808
> [228704.930002] R10: 0000000000000246 R11: ffff880187d07d50 R12: ffff880214ad7d28
> [228704.930002] R13: ffffffff806065a0 R14: ffffffff80607280 R15: 0000000000000000
> [228704.930002] FS: 0000000000000000(0000) GS:ffffffff80636040(0000) knlGS:0000000000000000
> [228704.930002] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [228704.930002] CR2: 00007fc056333fd8 CR3: 00000001ed270000 CR4: 00000000000006e0
> [228704.930002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [228704.930002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [228704.930002] Process rpciod/0 (pid: 1609, threadinfo ffff880214bcc000, task ffff880217b08780)
> [228704.930002] Stack: ffff880214b82f40 ffff880214b82f40 ffff880214b82f58 ffffffff8023cff3
> [228704.930002] 0000000000000000 ffff880217b08780 ffffffff8023f7d7 ffff880214bcdef8
> [228704.930002] ffff880214bcdef8 ffffffff806065a0 ffffffff80607280 ffff880214b82f40
> [228704.930002] Call Trace:
> [228704.930002] [<ffffffff8023cff3>] ? worker_thread+0x90/0x9b
> [228704.930002] [<ffffffff8023f7d7>] ? autoremove_wake_function+0x0/0x2e
> [228704.930002] [<ffffffff8023cf63>] ? worker_thread+0x0/0x9b
> [228704.930002] [<ffffffff8023f6c2>] ? kthread+0x47/0x75
> [228704.930002] [<ffffffff8022afa8>] ? schedule_tail+0x27/0x5f
> [228704.930002] [<ffffffff8020ccb9>] ? child_rip+0xa/0x11
> [228704.930002] [<ffffffff8023f67b>] ? kthread+0x0/0x75
> [228704.930002] [<ffffffff8020ccaf>] ? child_rip+0x0/0x11
> [228704.930002]
> [228704.930002]
> [228704.930002] Code: 6f 18 48 89 7b 30 48 8b 11 48 8b 41 08 48 89 42 08 48 89 10 48 89 49 08 48 89 09 fe 03 fb 48 8b 41 f8 48 83 e0 fc 48 39 d8 74 04 <0f> 0b eb fe f0 80 61 f8 fe ff d5 65 48 8b 04 25 10 00 00 00 8b
> [228704.930002] RIP [<ffffffff8023c6db>] run_workqueue+0x6f/0x102
> [228704.930002] RSP <ffff880214bcdec0>
> [228704.941003] ---[ end trace deef6e5387b5a584 ]---

Thanks for any input; right now I'm quite helpless.

Cheers

Carsten