Linux 2.5.24 and aic7xxx oopsing in ahc_linux_isr on system with LOTs of SCSI cards.

From: Gross, Mark (mark.gross@intel.com)
Date: Thu Jun 27 2002 - 11:54:41 EST


I have a box with a LOT of SCSI adapters that isn't happy with 2.5 kernels,
but is solid with the 2.4 kernels.

I'm trying to test 2.5.24 on my block I/O system, which has effectively 8
aic7xxx adapters (one on board, 3 dual channel PCI cards, and 1 single
channel PCI card).

It panics with a NULL dereference in ahc_linux_isr() in aic7xxx_linux.c.

I found the following in the archives and I wonder if I'm getting a similar
failure. What is the status for the fix to this problem?

I'm using the new AIC7xxx support from make menuconfig for my 2.5.24 kernel.

--mgross

On Thu, Feb 28, 2002 at 05:39:36PM -0700, Justin T. Gibbs wrote:
>
>> >The irq is enabled (request_irq called) via ahc_linux_pci_probe;
>> >host is initialized via ahc_linux_register_host, see ahc_linux_detect.
>>
>> Actually, the interrupt is enabled in the driver PCI core in aic7xxx_pci.c
>> (looking at v6.2.5 of the driver). It seems that this will have to move
>> out of common code and into the OSM to satisfy Linux.
>
>Can't we call the interrupt handler for a shared irq? So, even if the
>aic interrupt is disabled, its interrupt handler can be called?

The call is out to the OSM to register the handler. This is only
performed once all setup is already performed, so a shared interrupt
is handled correctly. After the registration, the aic7xxx's interrupt
source is enabled.

>
>> >Anyone have a fix for this problem? The correct fix looks more complex
>> >than just moving up the ahc_linux_register_host() calls.
>> >
>> >For now, I'll try modifying ahc_linux_isr() to return early if host is
>> >NULL.
>
>The above is working for me for now.

Unless this has changed in 2.5, the crux of the problem is that you only
get your host structure once you've registered your controller instance.
In the process of registering the controller, you can be called back to
handle commands. If you can't preallocate the host structure prior to
registering it to the system, using a lock inside of it will cause these
kinds of issues. Checking for a NULL host structure on every interrupt is
bogus.

--
Justin
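
To make the ordering problem described above concrete, here is a minimal
user-space sketch in C. It is not aic7xxx or SCSI mid-layer code: struct host,
struct softc, sim_isr(), sim_register_returns_host() and the two attach
functions are all hypothetical stand-ins invented for illustration. It assumes
only what the message states: the host structure holds the lock, the host is
obtained as a result of registration, and the driver can be called back while
registration is still in progress.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the real aic7xxx or SCSI mid-layer types. */
struct host {
    int lock_taken;            /* pretend lock that lives inside the host */
};

struct softc {
    struct host *host;         /* NULL until the driver has a host */
    int null_host_irqs;        /* times the handler ran with no host */
    int handled_irqs;          /* times the handler ran normally */
};

/* Simulated interrupt handler: it needs the lock inside the host. */
static void sim_isr(struct softc *sc)
{
    if (sc->host == NULL) {
        /* A handler without this check would dereference NULL here. */
        sc->null_host_irqs++;
        return;
    }
    sc->host->lock_taken++;
    sc->handled_irqs++;
}

/*
 * Simulated registration that only hands the host back on return,
 * but calls into the driver before it returns.
 */
static struct host *sim_register_returns_host(struct softc *sc)
{
    struct host *h = calloc(1, sizeof(*h));

    if (h == NULL)
        return NULL;
    sim_isr(sc);               /* callback arrives mid-registration */
    return h;
}

/* Ordering A: the host pointer is set only after registration returns. */
static int attach_host_after_register(struct softc *sc)
{
    sc->host = sim_register_returns_host(sc);
    return sc->host != NULL ? 0 : -1;
}

/* Ordering B: the host is preallocated, then registration may call back. */
static int attach_host_preallocated(struct softc *sc)
{
    sc->host = calloc(1, sizeof(*sc->host));
    if (sc->host == NULL)
        return -1;
    sim_isr(sc);               /* same callback, host already valid */
    return 0;
}

/* Runs both orderings and reports what the handler saw. */
int demo(void)
{
    struct softc late = {0};
    struct softc early = {0};

    if (attach_host_after_register(&late) != 0)
        return -1;
    if (attach_host_preallocated(&early) != 0) {
        free(late.host);
        return -1;
    }
    assert(late.null_host_irqs == 1 && late.handled_irqs == 0);
    assert(early.null_host_irqs == 0 && early.handled_irqs == 1);
    printf("late host:  %d call(s) saw a NULL host\n", late.null_host_irqs);
    printf("early host: %d call(s) saw a NULL host\n", early.null_host_irqs);
    free(late.host);
    free(early.host);
    return 0;
}
```

Calling demo() from a main() exercises both paths. In ordering A the handler
runs once with a NULL host, which is the window the NULL check papers over; in
ordering B that window does not exist, so the check is never needed.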


