Re: [PATCH v2] x86/apic: Fix modify_irte NULL pointer

From: Thomas Gleixner
Date: Tue Aug 23 2016 - 07:34:21 EST


On Tue, 23 Aug 2016, Wanpeng Li wrote:

> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>
> native_smp_prepare_cpus
> -> default_setup_apic_routing
> -> enable_IR_x2apic
> -> irq_remapping_prepare
> -> intel_prepare_irq_remapping
> -> intel_setup_irq_remapping
>
> IR table is setup even if noapic boot parameter is added.
>
> As a result:
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffff8d5a5e58>] modify_irte+0x58/0x140
> PGD 209638067 PUD 2105f4067 PMD 0
> Oops: 0000 [#1] SMP
> RIP: 0010:[<ffffffff8d5a5e58>] [<ffffffff8d5a5e58>] modify_irte+0x58/0x140
> Call Trace:
> intel_ir_set_affinity+0xa3/0xb0
> msi_domain_set_affinity+0x21/0x70
> ? __irq_set_affinity+0x34/0x70
> irq_do_set_affinity+0x1d/0x70
> irq_set_affinity_locked+0xc2/0x100
> __irq_set_affinity+0x47/0x70
> write_irq_affinity.isra.7+0xcc/0xf0
> irq_affinity_proc_write+0x19/0x20
> proc_reg_write+0x3d/0x70
> ? rcu_sync_lockdep_assert+0x2f/0x60
> __vfs_write+0x28/0x120
> ? percpu_down_read+0x5c/0xa0
> ? __sb_start_write+0xca/0xe0
> ? __sb_start_write+0xca/0xe0
> vfs_write+0xb5/0x1b0
> SyS_write+0x49/0xa0
> do_syscall_64+0x81/0x220
> entry_SYSCALL64_slow_path+0x25/0x25
> RIP [<ffffffff8d5a5e58>] modify_irte+0x58/0x140
> RSP <ffff8e9ad01b7c78>
> CR2: 0000000000000000
>
> irqbalance is running at the end of booting and changes the irq affinity,
> then irte is flushed. We should not have MSI and such if apic is disabled.
> This patch fix it by return -ENODEV if apic is disabled in order to avoid
> to setup ir table for ioapic.

I don't see any ENODEV here in the patch. This changelog sucks. The problem is
neither irqbalance nor MSI. The problem is simply that we try to setup irq
remapping even if "noapic" is on the kernel command line and therefor ioapic
support is disabled.

> @@ -154,6 +154,9 @@ void __init default_setup_apic_routing(void)
> {
> int version = apic_version[boot_cpu_physical_apicid];
>
> + if (skip_ioapic_setup)
> + return;

So you skip the whole APIC setup as well. APIC != IOAPIC. This check belongs
into enable_IR_x2apic().

Thanks,

tglx