Re: [PATCH v2 1/2] x86/apic/kexec: Enable legacy irq mode before jump to kexec/kdump kernel

From: Eric W. Biederman
Date: Wed Feb 07 2018 - 12:35:40 EST


ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes:

> Baoquan He <bhe@xxxxxxxxxx> writes:
>
>> On a kvm guest, the kernel always prints the warning below when the kdump
>> kernel boots.
>>
>> [ 0.001000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1467 setup_local_APIC+0x228/0x330
>> [ 0.001000] Modules linked in:
>> [ 0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #3
>> [ 0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
>> [ 0.001000] RIP: 0010:setup_local_APIC+0x228/0x330
>> [ 0.001000] RSP: 0000:ffffffffb6e03eb8 EFLAGS: 00010286
>> [ 0.001000] RAX: 0000009edb4c4d84 RBX: 0000000000000000 RCX: 00000000b099d800
>> [ 0.001000] RDX: 0000009e00000000 RSI: 0000000000000000 RDI: 0000000000000810
>> [ 0.001000] RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000001
>> [ 0.001000] R10: ffff98ce6a801c00 R11: 0761076d072f0776 R12: 0000000000000001
>> [ 0.001000] R13: 00000000000000f0 R14: 0000000000004000 R15: ffffffffffffc6ff
>> [ 0.001000] FS: 0000000000000000(0000) GS:ffff98ce6bc00000(0000) knlGS:0000000000000000
>> [ 0.001000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 0.001000] CR2: 00000000ffffffff CR3: 0000000022209000 CR4: 00000000000406b0
>> [ 0.001000] Call Trace:
>> [ 0.001000] apic_bsp_setup+0x56/0x74
>> [ 0.001000] x86_late_time_init+0x11/0x16
>> [ 0.001000] start_kernel+0x3c9/0x486
>> [ 0.001000] secondary_startup_64+0xa5/0xb0
>> [ 0.001000] Code: 00 85 c9 74 2d 0f 31 c1 e1 0a 48 c1 e2 20 41 89 cf 4c 03 7c 24 08 48 09 d0 49 29 c7 4c 89 3c 24 48 83 3c 24 00 0f 8f 8f fe ff
>> ff <0f> ff e9 10 ff ff ff 48 83 2c 24 01 eb e7 48 83 c4 18 5b 5d 41
>> [ 0.001000] ---[ end trace b88e71b9a6ebebdd ]---
>> [ 0.001000] masked ExtINT on CPU#0
>>
>> The root cause is that legacy irq mode is disabled before the jump to the
>> kexec/kdump kernel since commit 522e66464467 ("x86/apic: Disable I/O APIC
>> before shutdown of the local APIC"). In that commit, the lapic_shutdown()
>> call was moved after disable_IO_APIC(). In fact, disable_IO_APIC() not only
>> calls clear_IO_APIC() to disable the IO-APIC, but also sets up the LAPIC and
>> IO-APIC to put the system into PIC or virtual wire mode. Hence the local
>> APIC is disabled completely by the subsequent call to lapic_shutdown().
>
> The actions of lapic_shutdown do not depend on the actions of
> disable_IO_APIC so this description and justification are nonsense.
>
> Further we don't hardware disable the local APIC except when we hardware
> enable it. And only on 32bit at that.
>
> I keep wondering if the above oops is due to an emulation bug in kvm.
> If that is the case it might be better to fix kvm.

Sigh. Reading a little deeper I see where the local apic is affected.
It is the work of disconnect_bsp_APIC called from disable_IO_APIC.

Calling lapic_shutdown (which clears the local apic) after the local
apic has been placed into virtual wire mode would indeed be a problem.

Now that I see that, I agree in essence with this patch series.
I don't agree with the implementation details.

Can you please split switch_to_legacy_irq_mode out of disable_IO_APIC
in a single patch?

In a second patch just perform the code motion and place
switch_to_legacy_irq_mode after lapic_shutdown() where disable_IO_APIC
used to be.
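
To spell out the ordering, here is a userspace toy model in C. It is not
kernel code: the irq_state struct, its two flags, and all the function
bodies are made up for illustration, and it assumes that
switch_to_legacy_irq_mode ends up holding the disconnect_bsp_APIC half of
today's disable_IO_APIC. Only the call order is meant to match the text.

```c
/* Userspace toy model of the shutdown ordering; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state flags, invented for this sketch. */
struct irq_state {
	bool ioapic_entries_cleared;	/* IO-APIC entries have been cleared */
	bool lapic_virtual_wire;	/* local APIC is set up for virtual wire mode */
};

/* Stand-in for clear_IO_APIC(). */
static void clear_IO_APIC(struct irq_state *s)
{
	s->ioapic_entries_cleared = true;
}

/* Stand-in for disconnect_bsp_APIC(): enter virtual wire mode. */
static void disconnect_bsp_APIC(struct irq_state *s)
{
	s->lapic_virtual_wire = true;
}

/* Stand-in for lapic_shutdown(): clearing the local APIC wipes that setup. */
static void lapic_shutdown(struct irq_state *s)
{
	s->lapic_virtual_wire = false;
}

/* Assumed result of the split: only the legacy-mode half. */
static void switch_to_legacy_irq_mode(struct irq_state *s)
{
	disconnect_bsp_APIC(s);
}

/* Ordering since 522e66464467: disable_IO_APIC(), then lapic_shutdown(). */
struct irq_state shutdown_current_order(void)
{
	struct irq_state s = { false, false };

	clear_IO_APIC(&s);		/* first half of disable_IO_APIC() */
	disconnect_bsp_APIC(&s);	/* second half of disable_IO_APIC() */
	lapic_shutdown(&s);		/* undoes the virtual wire setup */
	return s;
}

/* Requested ordering: legacy irq mode is restored after lapic_shutdown(). */
struct irq_state shutdown_requested_order(void)
{
	struct irq_state s = { false, false };

	clear_IO_APIC(&s);
	lapic_shutdown(&s);
	switch_to_legacy_irq_mode(&s);	/* where disable_IO_APIC used to be */
	return s;
}
```

In this model shutdown_current_order() leaves lapic_virtual_wire false,
while shutdown_requested_order() leaves it true, which is the whole point
of the code motion.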

If you do just that, the code will make much more sense and will be a
candidate for backporting to stable, as it is a fix for an old
regression.

Please give it a patch title that describes restoring the ordering to
achieve this effect, something like:
"Fix enabling legacy irq mode in reboot and kexec/kdump"

Your subject line above makes it sound like enabling legacy irq mode is
something new, when in fact it is what the code has been trying to do all
along. All that happened is that a bug slipped in, and you are fixing
it.

Eric