Re: [PATCH v2] coresight: tmc-etr: Fix perf_data check.

From: Mathieu Poirier
Date: Thu Aug 15 2019 - 13:22:50 EST


On Wed, 14 Aug 2019 at 17:51, Yabin Cui <yabinc@xxxxxxxxxx> wrote:
>
> > Did you actually see the check fail or is this a theoretical thing?
> > I'm really perplexed here as I have tested this scenario many times
> > without issues.
> >
> I have seen this warning in dmesg output; that's how I found the problem.
>
> > In CPU wide scenarios each perf event (one per CPU) is associated with
> > an event_data during the setup process. The event_data is the
> > etr_perf holding a reference to the perf ring buffer for that specific
> > event along with the etr_buf, regardless of who created the latter.
>
> Agree.
>
> > From there, when the event is installed on a CPU, the csdev for that
> > CPU is given a reference to the event_data of that event[1]. Before
> > going further notice how there is a per CPU csdev and event handle to
> > keep track of event specifics[2]. As such both (per CPU) csdev and
> > event handle carry the exact same reference to the etr_perf.
> >
> On my test device (Pixel 3), there is an ETM device on each cpu, but only
> one ETR device for the whole device. So there is only one instance of etr
> csdev in the kernel. If multiple cpus are scheduling on etm perf events at
> the same time, all of them are trying to set their event_data to the same
> etr csdev. And different perf events have different event_data. A warning
> situation is as below:
>
> cpu 0
> schedule on event A (set etr csdev->perf_data to event_a.etr_perf)
>
> cpu 1
> schedule on event B (set etr csdev->perf_data to event_b.etr_perf)
>

You are 100% right, and looking at it again this morning it just jumped
out at me. I simply can't understand how it did not manifest itself
during all the hammering I did on it.

Please see details in my other (and upcoming) email.

Thanks,
Mathieu

> cpu 1
> schedule off event B (update buffer, does nothing since csdev->refcnt != 1)
>
> cpu 0
> schedule off event A (update buffer, but etr csdev->perf_data check fails)