BUGS: kernel failed to boot

From: kzt
Date: Wed Mar 04 2015 - 21:05:26 EST



Kernel 4.0.0-rc2 fails to boot (see the oops below). By bisecting, I found
that commit 8329aa9fff causes the crash. When
CONFIG_HYPERVISOR_GUEST is disabled, hypervisor_x2apic_available()
simply returns false, so x2apic_disable() is called unconditionally.
Before 8329aa9fff, x2apic_disable() was not called in that configuration
unless max_physical_apicid > 255. Can we simply revert this? I'm not sure how
the reverted change makes suspend fail on the Chromebook Pixel, as mentioned in
8329aa9fff. I tested this on an 18-core Haswell node (2699v3).

// 8329aa9fff
static __init void try_to_enable_x2apic(int remap_mode)
....
        if (max_physical_apicid > 255 ||
            !hypervisor_x2apic_available()) {
                pr_info("x2apic: IRQ remapping doesn't support X2APIC mode\n");
                x2apic_disable();
                return;
        }


// 8329aa9fff^
        if (max_physical_apicid > 255 ||
            (IS_ENABLED(CONFIG_HYPERVISOR_GUEST) &&
             !hypervisor_x2apic_available())) {
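
To make the difference between the two conditions concrete, here is a
minimal userspace sketch (not kernel code). The function names and the
boolean parameters are made up for illustration; hv_x2apic_available()
is only a stand-in that assumes, as described above, that the real
helper returns false when CONFIG_HYPERVISOR_GUEST is disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for hypervisor_x2apic_available().
 * Assumption: it returns false whenever CONFIG_HYPERVISOR_GUEST is off.
 */
static bool hv_x2apic_available(bool config_hv_guest, bool hv_has_x2apic)
{
        return config_hv_guest && hv_has_x2apic;
}

/* Condition at 8329aa9fff: true means x2apic_disable() would be called. */
static bool disable_at_8329aa9fff(unsigned int max_apicid,
                                  bool config_hv_guest, bool hv_has_x2apic)
{
        return max_apicid > 255 ||
               !hv_x2apic_available(config_hv_guest, hv_has_x2apic);
}

/* Condition at 8329aa9fff^: true means x2apic_disable() would be called. */
static bool disable_at_parent(unsigned int max_apicid,
                              bool config_hv_guest, bool hv_has_x2apic)
{
        return max_apicid > 255 ||
               (config_hv_guest &&
                !hv_x2apic_available(config_hv_guest, hv_has_x2apic));
}

/* The reported case: option disabled, all APIC IDs below 256. */
static void check_reported_case(void)
{
        assert(disable_at_8329aa9fff(100, false, false));
        assert(!disable_at_parent(100, false, false));
}

/* Print both decisions for every combination of the two booleans. */
static void print_decisions(unsigned int max_apicid)
{
        for (int cfg = 0; cfg <= 1; cfg++)
                for (int hv = 0; hv <= 1; hv++)
                        printf("CONFIG_HYPERVISOR_GUEST=%d hv_x2apic=%d: "
                               "8329aa9fff=%d parent=%d\n", cfg, hv,
                               disable_at_8329aa9fff(max_apicid, cfg, hv),
                               disable_at_parent(max_apicid, cfg, hv));
}
```

Calling check_reported_case() and print_decisions(100) from main() shows
that the two conditions differ only in the rows where
CONFIG_HYPERVISOR_GUEST is disabled, which is the configuration that
crashes here.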


commit 8329aa9fff3fca84009e6a444d8d160193643bac
Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Fri Feb 13 10:26:18 2015 -0800

Revert "x86/apic: Only disable CPU x2apic mode when necessary"

This reverts commit 5fcee53ce705d49c766f8a302c7e93bdfc33c124.

It causes the suspend to fail on at least the Chromebook Pixel, possibly
other platforms too.

Joerg Roedel points out that the logic should probably have been

        if (max_physical_apicid > 255 ||
            !(IS_ENABLED(CONFIG_HYPERVISOR_GUEST) &&
              hypervisor_x2apic_available())) {

instead, but since the code is not in any fast-path, so we can just live
without that optimization and just revert to the original code.

--------------

BUG: unable to handle kernel paging request at ffffffffff57c020
IP: [<ffffffff8108cc33>] native_apic_mem_read+0x3/0x10
PGD 1917067 PUD 1919067 PMD 191a067 PTE 0
Oops: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.0.0-rc2-00150-g6587457 #3
Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.0b 09/17/2014
task: ffff88205bda0000 ti: ffff88085c244000 task.ti: ffff88085c244000
RIP: 0010:[<ffffffff8108cc33>] [<ffffffff8108cc33>] native_apic_mem_read+0x3/0x10
RSP: 0000:ffff88085c247e08 EFLAGS: 00010246
RAX: ffffffff81932980 RBX: ffffffff81a4b080 RCX: 000000000000051c
RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000020
RBP: ffff88085c247e18 R08: 000000000000000a R09: 0000000000000001
R10: 00000000000001c1 R11: ffff88085c247ace R12: 000000000000a0e8
R13: 000000000000a0f0 R14: 0000000000000048 R15: 0000000000000047
FS: 0000000000000000(0000) GS:ffff88085f800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffff57c020 CR3: 0000000001916000 CR4: 00000000001406f0
Stack:
ffff88085c247e18 ffffffff81083436 ffff88085c247e68 ffffffff81a7e65d
0000000000000000 0000200042ee3151 0000000000000000 ffffffff81bcb5b8
0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
[<ffffffff81083436>] ? read_apic_id+0x16/0x30
[<ffffffff81a7e65d>] native_smp_prepare_cpus+0x23f/0x2a4
[<ffffffff81a6d25c>] kernel_init_freeable+0xf7/0x258
[<ffffffff815e2240>] ? rest_init+0x80/0x80
[<ffffffff815e224e>] kernel_init+0xe/0xf0
[<ffffffff815f777c>] ret_from_fork+0x7c/0xb0
[<ffffffff815e2240>] ? rest_init+0x80/0x80
Code: 0f 1f 84 00 00 00 00 00 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f 5d c3 90 55 89 ff 48 89 e5 89 b7 00 c0 57 ff 5d c3 66 90 55 89 ff <8b>\
87 00 c0 57 ff 48 89 e5 5d c3 66 90 55 48 8b 05 60 e5 59 00
RIP [<ffffffff8108cc33>] native_apic_mem_read+0x3/0x10
RSP <ffff88085c247e08>
CR2: ffffffffff57c020
---[ end trace 9ae9c0d0e009ecc2 ]---
Kernel panic - not syncing: Fatal exception
---[ end Kernel panic - not syncing: Fatal exception
random: nonblocking pool is initialized
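
As a cross-check of the oops, the faulting address can be recomputed
from the dump itself. The bytes at the marked position in the Code: line
(8b 87 00 c0 57 ff) decode as a 32-bit load from disp32(%rdi), and RDI
is 0x20. The sketch below is plain userspace C; the helper names are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Effective address of "mov disp32(%rdi),%eax": RDI plus sign-extended disp32. */
static uint64_t effective_address(uint64_t rdi, uint32_t disp32)
{
        return rdi + (uint64_t)(int64_t)(int32_t)disp32;
}

static void print_fault_address(void)
{
        /* RDI from the register dump; disp32 from bytes "00 c0 57 ff" (little-endian). */
        uint64_t addr = effective_address(0x20, 0xff57c000u);

        /* Must match the CR2 value reported in the oops. */
        assert(addr == 0xffffffffff57c020ULL);
        printf("%016llx\n", (unsigned long long)addr);
}
```

The result matches CR2 (ffffffffff57c020), and "PTE 0" in the oops says
that this address has no mapping, so the fault is the memory-mapped read
in native_apic_mem_read() reached via read_apic_id().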

- kaz
