Re: [LKP] [x86/smpboot] f5d6a52f511: BUG: kernel boot hang

From: Ingo Molnar
Date: Wed May 13 2015 - 02:47:44 EST



* Huang Ying <ying.huang@xxxxxxxxx> wrote:

> FYI, we noticed the below changes on
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/apic
> commit f5d6a52f511157c7476590532a23b5664b1ed877 ("x86/smpboot: Skip delays during SMP initialization similar to Xen")
>
>
> +------------------------------------------------+------------+------------+
> |                                                | 19e3d60d49 | f5d6a52f51 |
> +------------------------------------------------+------------+------------+
> | boot_successes                                 | 20         | 10         |
> | boot_failures                                  | 2          | 12         |
> | IP-Config:Auto-configuration_of_network_failed | 2          | 2          |
> | BUG:kernel_boot_hang                           | 0          | 10         |
> +------------------------------------------------+------------+------------+
>
>
> [ 0.000000] Initializing CPU#1
> [ 1.586595] kvm-clock: cpu 1, msr 0:13fdf041, secondary cpu clock
>
> BUG: kernel boot hang
> Elapsed time: 305
> qemu-system-i386 -enable-kvm -kernel /pkg/linux/i386-randconfig-c0-05111038/gcc-4.9/be67584d15684730aeed88cab355c5de8b0491fe/vmlinuz-4.1.0-rc3-01147-gbe67584 -append 'root=/dev/ram0 user=lkp job=/lkp/scheduled/vm-kbuild-yocto-i386-3/rand_boot-1-yocto-minimal-i386.cgz-i386-randconfig-c0-05111038-be67584d15684730aeed88cab355c5de8b0491fe-1-20150512-31766-1fzr1qi.yaml ARCH=i386 kconfig=i386-randconfig-c0-05111038 branch=linux-devel/devel-cairo-smoke-201505120219 commit=be67584d15684730aeed88cab355c5de8b0491fe BOOT_IMAGE=/pkg/linux/i386-randconfig-c0-05111038/gcc-4.9/be67584d15684730aeed88cab355c5de8b0491fe/vmlinuz-4.1.0-rc3-01147-gbe67584 max_uptime=600 RESULT_ROOT=/result/boot/1/vm-kbuild-yocto-i386/yocto-minimal-i386.cgz/i386-randconfig-c0-05111038/gcc-4.9/be67584d15684730aeed88cab355c5de8b0491fe/0 LKP_SERVER=inn earlyprintk=ttyS0,115200 systemd.log_level=err debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal rw ip=::::vm-kbuild-yocto-i386-3::dhcp drbd.minor_count=8' -initrd /fs/sdc1/initrd-vm-kbuild-yocto-i386-3 -m 320 -smp 2 -device e1000,netdev=net0 -netdev user,id=net0 -boot order=nc -no-reboot -watchdog i6300esb -rtc base=localtime -drive file=/fs/sdc1/disk0-vm-kbuild-yocto-i386-3,media=disk,if=virtio -pidfile /dev/shm/kboot/pid-vm-kbuild-yocto-i386-3 -serial file:/dev/shm/kboot/serial-vm-kbuild-yocto-i386-3 -daemonize -display none -monitor null

Hm, so in hindsight the commit, contrary to the changelog, not only
changed delays, but also changed the APIC_DM_INIT logic from:

	...
	apic_icr_write(APIC_INT_LEVELTRIG | APIC_DM_INIT, phys_apicid);
	...

to:

	if (!cpu_has_x2apic) {
		...
		apic_icr_write(APIC_INT_LEVELTRIG | APIC_DM_INIT, phys_apicid);
		...
	}

i.e. in the x2apic case it not only skips the delays, but skips the
INIT IPI deassertion as well!

So I think this change was poorly tested, and the semantic change
slipped through my review as well. Since it touches a very fragile
piece of historic code, I've reverted it.

Len's 10 msec delay optimization for modern x86 CPUs is kept intact.

Thanks,

Ingo