On Thu, Oct 29 2009, Kenji Kaneshige wrote:
Jens Axboe wrote:
On Thu, Oct 29 2009, Kenji Kaneshige wrote:
Can I confirm that? (sorry for my poor English skill)
Jens Axboe wrote:
Nope, it was captured post the power on attempt and the above log dump.
On Wed, Oct 28 2009, Kenji Kaneshige wrote:
From the console log, it seems that my debug patch worked as I expected
Jens Axboe wrote:
Here is the output of doing the power on with that patch applied.
On Tue, Oct 27 2009, Kenji Kaneshige wrote:
Could you try the attached debugging patch? With this patch, power
Jens Axboe wrote:
The box pretty much hangs when I try to power on a slot with pciehp, so
On Tue, Oct 20 2009, Alex Chiang wrote:
I'd like to confirm power fault interrupt storm, just in case.
* Jens Axboe <jens.axboe@xxxxxxxxxx>:
New board, the exact same thing happens.
On Tue, Oct 13 2009, Alex Chiang wrote:
Hm, so for some reason, firmware on your machine is telling us
It produces:
You mentioned in another mail that you echoed 1 into the various
Can you modprobe acpiphp with debug=1? And send the output?
acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:00:05.0
acpiphp_glue: found ACPI PCI Hotplug slot 1 at PCI 0000:08:00
acpiphp: Slot [1] registered
acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:00:07.0
acpiphp_glue: found ACPI PCI Hotplug slot 2 at PCI 0000:0b:00
acpiphp: Slot [2] registered
acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:80:07.0
acpiphp_glue: found ACPI PCI Hotplug slot 6 at PCI 0000:84:00
acpiphp: Slot [6] registered
acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:80:09.0
acpiphp_glue: found ACPI PCI Hotplug slot 7 at PCI 0000:87:00
acpiphp: Slot [7] registered
acpiphp_glue: Bus 0000:87 has 1 slot
acpiphp_glue: Bus 0000:84 has 1 slot
acpiphp_glue: Bus 0000:0b has 1 slot
acpiphp_glue: Bus 0000:08 has 1 slot
acpiphp_glue: Total 4 slots
slots' power files.
Did you do that after modprobing acpiphp with debug=1?
If so, there should be debug output when you try and turn them
on.
acpiphp: enable_slot - physical_slot = 1
acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
acpiphp: enable_slot - physical_slot = 2
acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
acpiphp: enable_slot - physical_slot = 6
acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
acpiphp: enable_slot - physical_slot = 7
acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
that it doesn't think cards are present and/or enabled.
Unfortunately, I don't know why your firmware would be saying
that. We could add some more debug printks to see what firmware
thinks about your system... Or we could just wait and see what
happens after you get your hardware replaced.
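For readers following along, here is a rough sketch of what "Slot status is not ACPI_STA_ALL" probably means in terms of the ACPI _STA bits. The four bit positions come from the ACPI specification; treating ACPI_STA_ALL as 0x0f (all four bits set) is an assumption about the acpiphp source, and the helper names are made up for illustration.

```python
# ACPI _STA result bits, as defined by the ACPI specification.
STA_BITS = {
    0x01: "present",
    0x02: "enabled",
    0x04: "shown in UI",
    0x08: "functioning",
}

# Assumption: acpiphp only powers on a slot when all four bits are set.
ACPI_STA_ALL = 0x0F


def missing_sta_bits(sta):
    """Return the names of the _STA bits that firmware did NOT set."""
    return [name for bit, name in STA_BITS.items() if not sta & bit]


def slot_can_be_enabled(sta):
    """Hypothetical helper mirroring the 'is ACPI_STA_ALL' check."""
    return (sta & ACPI_STA_ALL) == ACPI_STA_ALL


if __name__ == "__main__":
    # Example value only; the thread does not show the real _STA result.
    example = 0x00
    print(slot_can_be_enabled(example), missing_sta_bits(example))
```

A debug printk of the raw _STA value, fed through something like this, would show which of the bits firmware is leaving clear.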
Poke :-)
I have a card in one of the slots only this time.
No difference in before and after. Odd.
Also, quick dummy check, you are trying to power on populated
slots, right? :)
Yes :-)
Can you send the output of lspci -vv? And I'd like the output of
lspci -vt as well... Both before and after loading acpiphp
please.
Sent privately.
If you want to poke us again after your hardware swap, please do
so. Sorry for being not so helpful. :-/
One more thing I tried was pushing the power button on the slot
manually. With acpiphp, I get the same messages as above. Using pciehp,
I get the same power fault bit interrupt storm. So there is no difference
between using the sysfs interface and doing it on the box side; it doesn't
work either way.
Could you get the /proc/interrupts information after the power fault
problem happens and send it to me?
it's not easy to do... It doesn't hang with acpiphp, but doesn't work
either (see previous reply to Alex).
fault interrupts will be disabled after 100 power faults are detected
(I hope so). You can get /proc/interrupts after that.
pciehp 0000:00:05.0:pcie04: enable_slot: physical_slot = 1
pciehp 0000:00:05.0:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 77b
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 10
pciehp 0000:00:05.0:pcie04: pciehp_power_on_slot: SLOTCTRL a8 write cmd 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 10
pciehp 0000:00:05.0:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: Power fault interrupt received
pciehp 0000:00:05.0:pcie04: Power fault on Slot(1)
pciehp 0000:00:05.0:pcie04: Power fault bit 0 set
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: Data Link Layer Link Active not set in 1000 msec
pciehp 0000:00:05.0:pcie04: pciehp_check_link_status: lnk_status = 1001
pciehp 0000:00:05.0:pcie04: Link Training Error occurs
pciehp 0000:00:05.0:pcie04: Failed to check link status
pciehp 0000:00:05.0:pcie04: Command not completed in 1000 msec
pciehp 0000:00:05.0:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 12
pciehp 0000:00:05.0:pcie04: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 12
pciehp 0000:00:05.0:pcie04: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
pciehp 0000:00:05.0:pcie04: Command not completed in 1000 msec
pciehp 0000:00:05.0:pcie04: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
pciehp 0000:00:05.0:pcie04: Command not completed in 1000 msec
pciehp 0000:00:05.0:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
pciehp 0000:00:05.0:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 779
pciehp 0000:00:05.0:pcie04: pciehp_get_attention_status: SLOTCTRL a8, value read 779
(power fault event interrupts were disabled after 100 power fault events).
But for some reason, /proc/interrupts shows only 5 interrupts for
pciehp. Just in case, did you capture /proc/interrupts after doing the power on?
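To compare the log against the /proc/interrupts count, one option is to tally the pcie_isr lines per intr_loc value. The sketch below is only an illustration: the function name is hypothetical, and it assumes that pciehp prints intr_loc in hexadecimal. Note that one pcie_isr debug line does not necessarily equal one hardware interrupt, so the two numbers may legitimately differ.

```python
import re
from collections import Counter

# Matches the debug lines of the form "... pcie_isr: intr_loc 2".
ISR_RE = re.compile(r"pcie_isr: intr_loc ([0-9a-fA-F]+)")


def count_intr_loc(log_text):
    """Count pcie_isr debug lines, keyed by the intr_loc value.

    Assumption: intr_loc is printed as hex, so "10" means 0x10.
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = ISR_RE.search(line)
        if match:
            counts[int(match.group(1), 16)] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the log above; feed the full dmesg output
    # to get the real totals.
    sample = """\
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 10
pciehp 0000:00:05.0:pcie04: pciehp_power_on_slot: SLOTCTRL a8 write cmd 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
"""
    for value, n in sorted(count_intr_loc(sample).items()):
        print(f"intr_loc {value:#x}: {n} line(s)")
```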
The /proc/interrupts output was captured *before* the power on attempt and the log.
Correct?
No, the /proc/interrupts output was captured AFTER the power on attempt
and the log capture shown above.
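For anyone reading the pciehp dump without the spec at hand, the hex values can be decoded roughly as below. The bit layouts are those of the PCI Express Slot Status, Slot Control and Link Status registers. The function names are hypothetical, only a subset of fields is decoded, and it is an assumption that intr_loc, the SLOTCTRL values and lnk_status are printed in hexadecimal.

```python
# Slot Status register: the event bits that show up in intr_loc.
SLOT_STATUS_BITS = {
    0x0001: "Attention Button Pressed",
    0x0002: "Power Fault Detected",
    0x0004: "MRL Sensor Changed",
    0x0008: "Presence Detect Changed",
    0x0010: "Command Completed",
}


def decode_slot_status(intr_loc):
    """List the Slot Status events set in an intr_loc value."""
    return [name for bit, name in SLOT_STATUS_BITS.items() if intr_loc & bit]


def decode_slot_control(value):
    """Pick out a few Slot Control fields (not the complete register)."""
    return {
        "power_fault_irq_enabled": bool(value & 0x0002),
        "hotplug_irq_enabled": bool(value & 0x0020),
        "attention_indicator": (value >> 6) & 0x3,  # 1=on, 2=blink, 3=off
        "power_indicator": (value >> 8) & 0x3,      # 1=on, 2=blink, 3=off
        "power_controller_off": bool(value & 0x0400),
    }


def decode_link_status(value):
    """Pick out a few Link Status fields (not the complete register)."""
    return {
        "link_speed": value & 0xF,
        "link_width": (value >> 4) & 0x3F,
        "link_training": bool(value & 0x0800),
        "dll_link_active": bool(value & 0x2000),
    }


if __name__ == "__main__":
    print(decode_slot_status(0x12))
    print(decode_slot_control(0x77B))
    print(decode_slot_control(0x779))
    print(decode_link_status(0x1001))
```

Under those assumptions, the two SLOTCTRL reads in the log (77b before, 779 after) differ only in bit 1, the power fault interrupt enable, which would fit the debug patch having turned that interrupt off. lnk_status 1001 decodes to a negotiated link width of 0 with Data Link Layer Link Active clear.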