Re: perf,arm -- oops in validate_event

From: Will Deacon
Date: Tue Aug 06 2013 - 07:59:42 EST


On Tue, Aug 06, 2013 at 12:19:32PM +0100, Mark Rutland wrote:
> On Mon, Aug 05, 2013 at 10:17:37PM +0100, Vince Weaver wrote:
> > It looks like in validate_event() we do
> >
> > struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
> > ...
> > return armpmu->get_event_idx(hw_events, event) >= 0;
> >
> > armpmu is read into r3, and somehow the value at the offset of
> > armpmu->get_event_idx is either -1 or 0, so when it does a "blx"
> > branch to the address at this offset we get the oops.
> >
> > c001bf8c: e3120010 tst r2, #16
> > c001bf90: 0a000004 beq c001bfa8 <validate_event+0x48>
> > c001bf94: e5933070 ldr r3, [r3, #112] ; 0x70
> > * c001bf98: e12fff33 blx r3
> > c001bf9c: e1e00000 mvn r0, r0
> >
> > I'm having trouble tracing the code back past that, and I don't have time
> > to start adding printk's and recompiling right now.
> >
> > Vince
>
> I think I can save you the effort :)
>
> From the looks of the test case and the kernel code in question, it
> looks like the following happens:
>
> * We create a software event, which becomes its own group leader.
> * We create a hardware event, with the software event as its group
> leader.
> * When we try to schedule the hardware event, we try to validate all
> events in its event group (the leader + siblings), but in doing so we
> treat the software event as a hardware event, and erroneously try to
> get its (non-existent) arm_pmu container, and call some garbage value
> as get_event_idx(...).
>
> This could also happen if we tried to add events from different hardware
> PMUs to the same groups. I'm not sure if that's valid, but I couldn't
> see any code preventing that, and it seems the x86 validation logic is
> wired to allow this. If it's not valid, we could skip validation of
> software events by checking with is_software_event.

But we already check `event->pmu != leader_pmu' in validate_event, so we
shouldn't get anywhere near calling get_event_idx in the case you
describe. It sounds more like we have an inconsistency with one of the
events.
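
To make the check concrete, here is a minimal user-space sketch of the pmu
comparison alone. It is an illustration under assumptions, not the kernel
code: the two structs are hypothetical stand-ins that model only the fields
the comparison reads, skipped_by_leader_check is a made-up helper name, and
the event state checks in validate_event are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct pmu and struct perf_event.
 * Only the fields read by the comparison are modelled here. */
struct pmu {
	const char *name;
};

struct perf_event {
	const struct pmu *pmu;
	const struct perf_event *group_leader;
};

/* Hypothetical helper: returns true when the `event->pmu != leader_pmu'
 * test would make validate_event return early, i.e. before
 * get_event_idx could be called for this event. */
static bool skipped_by_leader_check(const struct perf_event *event)
{
	const struct pmu *leader_pmu;

	/* Every event is assumed to point at a group leader. */
	assert(event != NULL && event->group_leader != NULL);

	leader_pmu = event->group_leader->pmu;
	return event->pmu != leader_pmu;
}
```

Under these assumptions, a hardware event whose group leader belongs to a
different pmu is skipped (the helper returns true). An event that is its own
group_leader is compared against its own pmu, so the helper returns false
for it.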

Can you dump the events as they're processed in validate_group please?

Will