Re: 2.6.0: Badness in pci_find_subsys!!

From: Valdis.Kletnieks@vt.edu
Date: Thu Jul 24 2003 - 22:45:29 EST


On Thu, 24 Jul 2003 13:26:01 EDT, Douglas J Hunley <doug@hunley.homeip.net> said:

> Just had my athlon box lock-up solid. needed SysRq to reboot the thing..
> kernel info follows:
> Jul 24 13:08:23 doug kernel: Badness in pci_find_subsys at
> drivers/pci/search.c:132
> Jul 24 13:08:23 doug kernel: Call Trace:
> Jul 24 13:08:23 doug kernel: [<c02064a1>] pci_find_subsys+0x111/0x120
> Jul 24 13:08:23 doug kernel: [<c02064df>] pci_find_device+0x2f/0x40
> Jul 24 13:08:23 doug kernel: [<c0206368>] pci_find_slot+0x28/0x50
> Jul 24 13:08:23 doug kernel: [<f8a2ada4>] os_pci_init_handle+0x3a/0x67

The 'badness in pci_find_subsys' may not be related to your hang.

The NVidia msgs are basically caused by the fact that pci_find_slot() is
getting called in an interrupt, so we trigger the WARN_ON in pci_find_subsys().
The worry here is that we may be walking the PCI list on the interrupt side
while something else is hotplugging a new device into existence, causing it to
walk off the end of a inconsistent list. Unless you actually crapped out right
at 13:08:23, it's probably unrelated.

(I was getting the same NVidia traceback on a regular basis (3-4 at every start
of the X server, and 1 at X server shutdown) under 2.5.72-mm3, they stopped
when I went to 2.5.73-mm1. If you're still seeing them in 2.6.0-test1, I would
suspect something different in the -mm series is fixing them for me - first place
to look is what got added between 72-mm3 and 73-mm1.



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Jul 31 2003 - 22:00:24 EST