Re: [SCSI BUG 2.6.15-rc3-mm1] scheduling while atomic on boot time

From: Andrew Morton
Date: Fri Dec 02 2005 - 14:31:30 EST


Wu Fengguang <wfg@xxxxxxxxxxxxxxxx> wrote:
>
> My server occasionally crashes on boot time, this has been happening in many
> recent kernel versions(at least from 2.6.14-rcx). It is rare enough, I setup
> netconsole and rebooted numerous times, but still failed to catch it. Luckily
> it happened again this time, and does not panic. Here is the logs.
>
> Thanks,
> Wu
>
> Error messages:
> [4294676.927000] scheduling while atomic: ksoftirqd/0/0x00000200/3
> [4294676.927000] [dump_stack+21/32] dump_stack+0x15/0x20
> [4294676.927000] [schedule+3563/3584] schedule+0xdeb/0xe00
> [4294676.927000] [__down+138/272] __down+0x8a/0x110
> [4294676.927000] [__sched_text_start+10/16] <6>scsi[0]: scanning scsi channel 1 [Phy 1] for non-raid devices
> [4294676.927000] __down_failed+0xa/0x10
> [4294676.927000] [.text.lock.main+43/71] .text.lock.main+0x2b/0x47
> [4294676.928000] [device_del+62/112] device_del+0x3e/0x70
> [4294676.928000] [scsi_target_reap+137/176] scsi_target_reap+0x89/0xb0
> [4294676.928000] [scsi_device_dev_release+251/400] scsi_device_dev_release+0xfb/0x190
> [4294676.928000] [device_release+23/80] device_release+0x17/0x50
> [4294676.928000] [kobject_cleanup+116/128] kobject_cleanup+0x74/0x80
> [4294676.928000] [kobject_release+11/16] kobject_release+0xb/0x10
> [4294676.929000] [kref_put+52/160] kref_put+0x34/0xa0
> [4294676.929000] [kobject_put+20/32] kobject_put+0x14/0x20
> [4294676.929000] [put_device+17/32] put_device+0x11/0x20
> [4294676.929000] [scsi_next_command+48/64] scsi_next_command+0x30/0x40
> [4294676.929000] [scsi_end_request+165/192] scsi_end_request+0xa5/0xc0
> [4294676.929000] [scsi_io_completion+540/1152] scsi_io_completion+0x21c/0x480
> [4294676.929000] [scsi_generic_done+43/64] scsi_generic_done+0x2b/0x40
> [4294676.930000] [scsi_finish_command+146/240] scsi_finish_command+0x92/0xf0
> [4294676.930000] [scsi_softirq+215/320] scsi_softirq+0xd7/0x140
> [4294676.930000] [__do_softirq+216/240] __do_softirq+0xd8/0xf0
> [4294676.930000] [do_softirq+74/96] do_softirq+0x4a/0x60
> [4294676.930000] =======================

Which device driver are you using?

This is just a warning - it won't necessarily cause a crash and in this
case it didn't appear to do so.

I seem to recall diagnosing this exact locking problem a month or so ago,
and cc'ing linux-scsi on that analysis.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/