Re: x86_64 boot hang when CONFIG_NUMA=n

From: Yinghai Lu
Date: Fri Jun 06 2008 - 01:41:17 EST


On Thu, Jun 5, 2008 at 2:50 PM, Randy Dunlap <randy.dunlap@xxxxxxxxxx> wrote:
> On 2.6.26-rc[2345], I am seeing a hang during boot with CONFIG_NUMA=n, but changing
> to CONFIG_NUMA=y allows successful boot.
>
> This is on a 4-way AMD64 (HP) server with 8 GB RAM.
>
> Using initcall_debug, the last output on a hang is from arch/x86/pci/k8-bus_64.c:
>
> calling early_fill_mp_bus_info+0x0/0x7b2
> node 0 link 1: io port [1000, 3fff]
> node 1 link 2: io port [4000, ffff]
> TOM: 0000000080000000 aka 2048M
> node 0 link 1: mmio [e8000000, fddfffff]
> node 1 link 2: mmio [fde00000, fdffffff]
> node 0 link 1: mmio [80000000, 83ffffff]
> node 1 link 2: mmio [84000000, 8fffffff]
> node 0 link 1: mmio [a0000, bffff]
> TOM2: 0000000280000000 aka 10240M
> bus: [00,3f] on node 0 link 1
> bus: 00 index 0 io port: [0, 3fff]
> bus: 00 index 1 mmio: [90000000, fddfffff]
> bus: 00 index 2 mmio: [80000000, 83ffffff]
> bus: 00 index 3 mmio: [a0000, bffff]
> bus: 00 index 4 mmio: [fe000000, ffffffff]
> bus: 00 index 5 mmio: [280000000, fcffffffff]
> bus: [40,ff] on node 1 link 2
> bus: 40 index 0 io port: [4000, ffff]
> bus: 40 index 1 mmio: [fde00000, fdffffff]
>
>
> There should be an index 2 line printed next, as in the output below from a
> version slightly modified for debugging (with CONFIG_NUMA=y). Or maybe the
> following line(s) just aren't making it to the (net)console log and some other
> initcall function is actually hanging (??):
>
> bus: [40,ff] on node 1 link 2
> bus: 40 index 0/3 io port: [4000, ffff]
> bus: 40 index 1/3 mmio: [fde00000, fdffffff]
> bus: 40 index 2/3 mmio: [84000000, 8fffffff]
> early_fill_mp_bus_info: done
>
>
> Has anyone seen something like this? Any patches to test?
>
> The next initcall functions (on a working boot) are:
>
> calling arch_kdebugfs_init+0x0/0x8
> initcall arch_kdebugfs_init+0x0/0x8 returned 0 after 0 msecs
> calling mtrr_if_init+0x0/0x77
> initcall mtrr_if_init+0x0/0x77 returned 0 after 0 msecs
> calling ffh_cstate_init+0x0/0x31
> initcall ffh_cstate_init+0x0/0x31 returned -1 after 0 msecs
> initcall ffh_cstate_init+0x0/0x31 returned with error code -1
> calling acpi_pci_init+0x0/0x4a
> ACPI: bus type pci registered
> initcall acpi_pci_init+0x0/0x4a returned 0 after 0 msecs

Can you send your config?

YH