Re: [GIT PULL] perf fixes

From: Linus Torvalds
Date: Thu Mar 14 2013 - 17:06:08 EST

On Thu, Mar 14, 2013 at 1:32 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> And to make things interesting, I seem to be able to only reproduce
> this *after* a suspend cycle. That may be just happenstance, since it
> seemed to be hard to replicate and most of the time it has happened
> under X with no messages visible at all, but that *seems* to be the
> pattern.
> And the one time I got it to happen on the text console, things
> scrolled off (watchdog warnings due to lockups), but I did get a NULL
> pointer dereference in intel_pmu_enable_all().
> I'll try to reproduce it and get a picture,

Theory more or less confirmed.

It does need a suspend/resume cycle, and I have a picture. The oops
happens immediately when trying to do any perf work after the first
suspend, before suspending I seem to be able to reliably use perf. It
could still be just random flakiness, but I don't think so.

The NULL pointer dereference is at intel_pmu_enable_all+0x4d/0xa0 for
me, which seems to be the load of the

if (test_bit(INTEL_PMC_IDX_FIXED_BTS, cpuc->active_mask))

thing. It says

BUG: unable to handle NULL pointer dereference at 0000000000000028

But that error makes no sense. The code at that EIP is

48 8b 83 00 02 00 00 mov 0x200(%rbx),%rax <-- trapping instruction

and the value printed out for %rbx is 0xffff80014f20b8e0, so it should
*not* be a NULL pointer dereference (and "cpuc" was also used just
before the wrmsrl).

So I suspect that the "wrmsrl" that was just before that instruction
does something odd, and the PMU is in some odd state, so that the NULL
pointer dereference actually has something to do with *that*, rather
than the instruction itself.

The callchain looks normal. It's

finish_task_switch ->
__perf_event_task_sched_in ->
perf_event_context_sched_in ->
perf_pmu_enable ->
x86_pmu_enable ->

The immediately preceding wrmsrl was done with rax=0xf, rdx=0x7,
rcx=0x38f according to the register dump (but the picture isn't great,
so the numbers aren't 100% reliable).

Does this give any clues?

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at