[BUG, 4.8-rc7] perf: oops in intel_pmu_enable_all

From: Dave Chinner
Date: Sun Sep 25 2016 - 19:23:06 EST


Hi Folks,

I just upgraded a test VM from 4.8-rc6 to 4.8-rc7, and went to run:

# perf_4.7 top -g -U

inside the VM - the kernel oopsed with the trace below. The perf
binary was built from a 4.7 kernel, so it's not the latest, but it
still shouldn't oops the kernel.

Reproduced multiple times simply by restarting the VM and re-running
the perf command. Looks like a recent regression - this perf binary
worked fine around -rc3/-rc4 - I don't think I ran it on -rc6 at
all.

[ 16.485119] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[ 16.485122] IP: [<ffffffff8100bffc>] intel_bts_interrupt+0x3c/0x130
[ 16.485123] PGD 0
[ 16.485123] Oops: 0000 [#1] PREEMPT SMP
[ 16.485124] Modules linked in:
[ 16.485124] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7-dgc+ #975
[ 16.485125] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[ 16.485125] task: ffffffff8240b540 task.stack: ffffffff82400000
[ 16.485126] RIP: 0010:[<ffffffff8100bffc>] [<ffffffff8100bffc>] intel_bts_interrupt+0x3c/0x130
[ 16.485126] RSP: 0018:ffff88013bc05bc8 EFLAGS: 00010046
[ 16.485127] RAX: 0000000000000000 RBX: ffff88013bc0b680 RCX: 000000000000038f
[ 16.485127] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000038f
[ 16.485128] RBP: ffff88013bc05bf0 R08: 000000000000000e R09: 0000000000000000
[ 16.485128] R10: ffff88013bc0aae0 R11: ffff88013bc03ba0 R12: 0000000000000000
[ 16.485129] R13: 00000000ffffffff R14: 00000003d694c86e R15: 0000000000000001
[ 16.485129] FS: 0000000000000000(0000) GS:ffff88013bc00000(0000) knlGS:0000000000000000
[ 16.485130] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 16.485130] CR2: 0000000000000018 CR3: 0000000002406000 CR4: 00000000000406f0
[ 16.485131] Stack:
[ 16.485131] ffff88013bc05ef8 0000000000000000 ffff88013bc0a3c0 00000000ffffffff
[ 16.485131] 00000003d694c86e ffff88013bc05e20 ffffffff8100aed4 0000000000000000
[ 16.485132] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 16.485132] Call Trace:
[ 16.485133] <NMI> [<ffffffff8100aed4>] intel_pmu_handle_irq+0x54/0x470
[ 16.485133] [<ffffffff81004b9d>] perf_event_nmi_handler+0x2d/0x50
[ 16.485134] [<ffffffff810730ae>] nmi_handle+0x5e/0x130
[ 16.485134] [<ffffffff81073618>] default_do_nmi+0x48/0x120
[ 16.485134] [<ffffffff810737dc>] do_nmi+0xec/0x140
[ 16.485135] [<ffffffff81c64591>] end_repeat_nmi+0x1a/0x1e
[ 16.485135] [<ffffffff8109f8b6>] ? native_write_msr+0x6/0x30
[ 16.485135] [<ffffffff8109f8b6>] ? native_write_msr+0x6/0x30
[ 16.485136] [<ffffffff8109f8b6>] ? native_write_msr+0x6/0x30
[ 16.485136] <<EOE>> <IRQ> [<ffffffff810098bd>] ? __intel_pmu_enable_all+0x4d/0xa0
[ 16.485137] [<ffffffff81009920>] intel_pmu_enable_all+0x10/0x20
[ 16.485137] [<ffffffff810067a1>] x86_pmu_enable+0x261/0x2f0
[ 16.485137] [<ffffffff81192cd2>] perf_pmu_enable+0x22/0x30
[ 16.485138] [<ffffffff81194301>] ctx_resched+0x51/0x60
[ 16.485138] [<ffffffff8119450c>] __perf_event_enable+0x1fc/0x250
[ 16.485139] [<ffffffff8118c90e>] event_function+0xae/0x190
[ 16.485139] [<ffffffff8118dfe0>] ? perf_cgroup_attach+0x50/0x50
[ 16.485139] [<ffffffff8118e01f>] remote_function+0x3f/0x50
[ 16.485140] [<ffffffff811328eb>] flush_smp_call_function_queue+0x7b/0x160
[ 16.485140] [<ffffffff81133443>] generic_smp_call_function_single_interrupt+0x13/0x60
[ 16.485141] [<ffffffff81090ae7>] smp_call_function_single_interrupt+0x27/0x40
[ 16.485141] [<ffffffff81c6510c>] call_function_single_interrupt+0x8c/0xa0
[ 16.485142] <EOI> [<ffffffff8109f326>] ? native_safe_halt+0x6/0x10
[ 16.485142] [<ffffffff810796fe>] default_idle+0x1e/0x100
[ 16.485142] [<ffffffff81079f1f>] arch_cpu_idle+0xf/0x20
[ 16.485143] [<ffffffff810f49d3>] default_idle_call+0x33/0x40
[ 16.485143] [<ffffffff810f4cbe>] cpu_startup_entry+0x2de/0x340
[ 16.485143] [<ffffffff81c5b864>] rest_init+0x84/0x90
[ 16.485144] [<ffffffff82608f1a>] start_kernel+0x424/0x431
[ 16.485144] [<ffffffff82608120>] ? early_idt_handler_array+0x120/0x120
[ 16.485145] [<ffffffff8260843f>] x86_64_start_reservations+0x2a/0x2c
[ 16.485145] [<ffffffff8260857c>] x86_64_start_kernel+0x13b/0x14a
[ 16.485146] Code: 56 41 55 41 54 53 48 83 ec 08 65 48 03 05 65 e1 ff 7e 48 c7 c3 80 b6 00 00 48 8b 80 30 09 00 00 65 48 03 1d 4f e1 ff 7e 45 31 e4 <48> 8b 48 18 48
[ 16.485146] RIP [<ffffffff8100bffc>] intel_bts_interrupt+0x3c/0x130
[ 16.485147] RSP <ffff88013bc05bc8>
[ 16.485147] CR2: 0000000000000018
[ 16.535644] ---[ end trace 61a930b5078051b0 ]---
[ 16.535644] Kernel panic - not syncing: Fatal exception in interrupt
[ 16.535833] Kernel Offset: disabled
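
For anyone reading the trace: the bytes at the faulting RIP (the <48>
marker in the "Code:" line) can be decoded by hand and checked against
CR2. Below is a rough sketch of that arithmetic, not a general
disassembler. It assumes the four bytes 48 8b 48 18 form one complete
instruction and only handles that narrow encoding; the helper name
decode_mov_load is made up for illustration.

```python
# Bytes at the faulting RIP, copied from the "Code:" line of the oops
# (the <48> marker is where RIP points): 48 8b 48 18
fault_bytes = bytes.fromhex("488b4818")

# 64-bit register names indexed by their 3-bit encoding (no REX extension).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_load(b):
    """Hypothetical helper: decode only REX.W + 8B /r with an 8-bit displacement."""
    rex, opcode, modrm, disp8 = b
    assert rex == 0x48, "expected REX.W with no register extension bits"
    assert opcode == 0x8B, "expected MOV r64, r/m64"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod == 01 means [base + disp8]; rm == 100 would mean a SIB byte follows.
    assert mod == 0b01 and rm != 0b100, "expected [reg + disp8] without SIB"
    disp = disp8 - 256 if disp8 >= 0x80 else disp8  # sign-extend disp8
    return REG64[reg], REG64[rm], disp


dest, base, disp = decode_mov_load(fault_bytes)

# Only the base register value is needed; RAX comes from the register dump.
registers = {"rax": 0x0000000000000000}
fault_addr = (registers[base] + disp) & 0xFFFFFFFFFFFFFFFF

print(f"mov {disp:#x}(%{base}),%{dest} -> faulting address {fault_addr:#018x}")
```

With RAX = 0 from the register dump, the computed address matches the
CR2 value of 0000000000000018, i.e. a load at offset 0x18 from a NULL
pointer.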

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx