Re: Fw: crash on x86_64 - mm related?

From: James Bottomley
Date: Mon Dec 12 2005 - 13:24:47 EST


On Mon, 2005-12-12 at 10:09 -0800, Linus Torvalds wrote:
> Well, that patch is definitely broken.

No, it's not; all it's doing is deferring the device_put() from the
scsi_put_command() to after the scsi_run_queue(), which doesn't fix the
sleep while atomic problem of the device release method. In both cases
we still get the semaphore in atomic context problem which is caused by
scsi_reap_target() doing a device_del(), which I assumed (wrongly) was
valid from atomic context.

I'll fix the scsi_reap_target(), but it's nothing to do with the patch
you reversed.

> You say that it just causes a warning about sleeping in interrupt context,
> while I say that the warning is a serious error. If that semaphore _ever_
> is write-locked, the whole machine will crash from trying to sleep when it
> cannot sleep.
>
> So I can certainly undo the undo, but the fact is, the code is CRAP. I'd
> much rather get a real fix instead of having to select between two known
> bugs.

I'll find a fix for the real problem, but this patch isn't the cause.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/