Re: aha1542 oops caused by new request_irq routines

From: Thomas Gleixner
Date: Mon May 31 2010 - 13:19:24 EST


On Mon, 31 May 2010, James Bottomley wrote:
> On Mon, 2010-05-31 at 18:43 +0200, Thomas Gleixner wrote:
> > On Mon, 31 May 2010, James Bottomley wrote:
> >
> > > On Mon, 2010-05-31 at 10:03 -0400, Tedheadster wrote:
> > > > I'm reliably getting this oops:
> > > >
> > > > Configuring Adaptec (SCSI-ID 6) at IO:334, IRQ 10, DMA priority 6
> > > > BUG: sleeping function called from invalid context at mm/slub.c:1598
> > > > in_atomic(): 0, irqs_disabled(): 1, pid: 4782, name: modprobe
> > > > Pid: 4782, comm: modprobe Not tainted 2.6.30.10-105.2.23.RODATA.fc11.i586 #1
> > > > Call Trace:
> > > > [<c0469e58>] ? request_threaded_irq+0x85/0x145
> > > > [<c0422ab7>] __might_sleep+0xc4/0xc9
> > > > [<c04a4322>] kmem_cache_alloc_notrace+0x29/0xb0
> > > > [<c0469e58>] request_threaded_irq+0x85/0x145
> > > > [<d086439c>] ? do_aha1542_intr_handle+0x0/0x2be [aha1542]
> > > > [<d08696aa>] aha1542_detect+0x631/0x76f [aha1542]
> > > > [<d0869841>] init_this_scsi_driver+0x59/0xc7 [aha1542]
> > > > [<d08697e8>] ? init_this_scsi_driver+0x0/0xc7 [aha1542]
> > > > [<c040114b>] do_one_initcall+0x51/0x13f
> > > > [<c0451111>] sys_init_module+0x8b/0x192
> > > > [<c0403535>] syscall_call+0x7/0xb
> > > > scsi5 : Adaptec 1542
> > >
> > > So this one's a bit tricky. aha1542 uses a global spinlock to give it
> > > thread safety and various other things. In this case it's trying to use
> > > the lock to hold off the interrupt until everything is set up.
> > >
> > > Now that we're doing a GFP_KERNEL allocation in the interrupt handler
> > > code you can't disable interrupts while calling request_irq since this
> > > is an old card liable to spurious interrupts as it gets poked in setup.
> > >
> > > I think a possible solution is this, since the mere act of installing an
> > > interrupt handler shouldn't trigger the problem.
> > >
> > > However, I thought the pattern of disabling interrupts and setting up
> > > the handler and registers was a common one ... is there some way this is
> > > supposed to work now that doesn't involve altering the drivers?
> >
> > Most drivers do the sane thing:
> >
> > Disable interrupts at the device level
> > Install handler via request_irq()
> > Setup stuff
> > Enable interrupts at the device level
>
> That only works for some hardware ... a lot of older hardware can't
> disable interrupts; the best you can do is to have the box physically
> not listening to the line.
>
> > So no, there is no way this is supposed to work with drivers which
> > don't follow that simple scheme.
> >
> > commit 0e43785c5 (irq: use GFP_KERNEL for action allocation in
> > request_irq()) changed that particular instance to GFP_KERNEL because
> > the request_irq code calls (and always did) code which cannot be
> > called in atomic contexts, e.g. the proc entry handling.
>
> So, like I said, I think we can install the handler without tickling the
> hardware. Ideally we'd like to install it IRQ_DISABLED and then call
> enable_irq after we're done with the setup, but that doesn't seem to be
> possible.

We have a mechanism in place to do that, but it's not available for
drivers yet. If that's really a requirement, then we can make it
available with very little effort, but that does not resolve the
problem when the interrupt is shared and the interrupt line is already
enabled.
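
For reference, the ordering quoted above (disable at the device, install
the handler with CPU interrupts enabled, set up, enable at the device) can
be sketched as a small userspace model. This is only an illustration under
stated assumptions: every mock_* name and cpu_irqs_disabled is hypothetical,
none of it is kernel API, and it does not cover hardware that cannot mask
its own interrupt or a shared line that is already enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the probe ordering; not kernel code. */
struct mock_dev {
    bool irq_enabled;   /* device-level interrupt enable bit */
    bool configured;    /* set once the setup step has run */
    int  handled;       /* interrupts seen after setup finished */
    int  spurious;      /* interrupts seen before setup finished */
};

typedef void (*mock_handler_t)(struct mock_dev *);

static bool cpu_irqs_disabled;      /* models "local interrupts are off" */
static mock_handler_t installed;    /* models the registered handler */

/* Models request_irq(): it may sleep, so it refuses atomic context. */
static int mock_request_irq(mock_handler_t h)
{
    if (cpu_irqs_disabled)
        return -1;      /* the "sleeping function called from invalid context" case */
    installed = h;
    return 0;
}

static void mock_handler(struct mock_dev *dev)
{
    assert(dev != NULL);
    if (dev->configured)
        dev->handled++;
    else
        dev->spurious++;
}

/* Models the device raising its line: delivered only if the device-level
 * enable bit is set and a handler has been installed. */
static void mock_raise_irq(struct mock_dev *dev)
{
    if (dev->irq_enabled && installed)
        installed(dev);
}

static int mock_probe(struct mock_dev *dev)
{
    dev->irq_enabled = false;               /* 1. disable at the device level */
    if (mock_request_irq(mock_handler))     /* 2. install handler, CPU irqs on */
        return -1;
    mock_raise_irq(dev);                    /* poke during setup: masked, so dropped */
    dev->configured = true;                 /* 3. setup stuff */
    dev->irq_enabled = true;                /* 4. enable at the device level */
    return 0;
}
```

In this model a probe that runs with cpu_irqs_disabled set fails at step 2,
which corresponds to the aha1542 case of calling request_irq() under an
irq-disabling spinlock.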

Thanks,

tglx