Re: [PATCH 09/11] perf/x86/intel: Support task events with Intel CQM

From: Matt Fleming
Date: Fri Nov 07 2014 - 05:10:15 EST


On Fri, 07 Nov, at 10:08:04AM, Peter Zijlstra wrote:
>
> How is that supposed to work? You call __intel_cqm_event_count() on the
> one cpu per socket, but then you use a local_add, not an atomic_add,
> even though these adds can happen concurrently as per IPI broadcast.

Ouch, right. That's broken.
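
To spell out the race: local_t operations are only safe against the CPU that owns the variable, so concurrent adds from several sockets need a real atomic (atomic64_add() or similar in the kernel). Below is a rough userspace model of that, not kernel code. It uses C11 atomics and pthreads, with one thread standing in for each per-socket IPI handler. All the names (NR_SOCKETS, READS_PER_SOCKET, socket_reader, count_all_sockets) are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_SOCKETS 4             /* hypothetical number of sockets */
#define READS_PER_SOCKET 100000  /* hypothetical adds per handler */

/* Shared accumulator, standing in for the count updated from IPI context. */
static atomic_uint_fast64_t total;

/* One thread per socket, modelling handlers that run concurrently. */
static void *socket_reader(void *arg)
{
	uint64_t val = *(uint64_t *)arg;

	for (int i = 0; i < READS_PER_SOCKET; i++)
		atomic_fetch_add_explicit(&total, val, memory_order_relaxed);
	return NULL;
}

/* Start every handler, wait for all of them, return the summed count. */
uint64_t count_all_sockets(uint64_t per_read)
{
	pthread_t tid[NR_SOCKETS];
	int started = 0;

	atomic_store(&total, 0);
	for (int i = 0; i < NR_SOCKETS; i++) {
		if (pthread_create(&tid[i], NULL, socket_reader, &per_read) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	assert(started == NR_SOCKETS);
	return atomic_load(&total);
}
```

With a plain, non-atomic "total += val" in socket_reader() the result would be allowed to come up short, which is the userspace analogue of the local_add() problem.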

> Also, I think smp_call_function_many() ignores the current cpu, if this
> cpu happens to be the cpu for this socket, you're up some creek without
> no paddle, right?

OK, I didn't realise that. Yeah, that sounds very problematic. I think my
eyes skipped over the word "other" in the smp_call_function_many() docs:

* smp_call_function_many(): Run a function on a set of other CPUs.

So, the correct way to do this is to iterate over cqm_cpumask and invoke
smp_call_function_single(), right?
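
A toy userspace model of the difference, just to make the semantics concrete. This is a sketch and not kernel code: the cpumask is a plain bitmask and every name here (NR_CPUS, model_call_function_many, model_call_each_single, record_cpu, visited_many, visited_single) is hypothetical. It relies on smp_call_function_single() running the function locally when the target is the current cpu. on_each_cpu_mask() might be another option, since it also includes the local cpu.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8  /* hypothetical, small enough to fit a bitmask */

typedef void (*call_fn)(int cpu, void *info);

/* Models smp_call_function_many(): every cpu in mask EXCEPT this_cpu. */
static void model_call_function_many(uint32_t mask, int this_cpu,
				     call_fn fn, void *info)
{
	assert(this_cpu >= 0 && this_cpu < NR_CPUS);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if ((mask & (1u << cpu)) && cpu != this_cpu)
			fn(cpu, info);
}

/* Models walking the mask with one single-cpu call per entry;
 * the current cpu is not skipped. */
static void model_call_each_single(uint32_t mask, call_fn fn, void *info)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			fn(cpu, info);
}

/* Callback: record which cpus actually ran the function. */
static void record_cpu(int cpu, void *info)
{
	*(uint32_t *)info |= 1u << cpu;
}

uint32_t visited_many(uint32_t mask, int this_cpu)
{
	uint32_t seen = 0;

	model_call_function_many(mask, this_cpu, record_cpu, &seen);
	return seen;
}

uint32_t visited_single(uint32_t mask)
{
	uint32_t seen = 0;

	model_call_each_single(mask, record_cpu, &seen);
	return seen;
}
```

With a mask of 0x11 (cpus 0 and 4, one reader per socket) and the caller on cpu 0, the "many" model only reaches cpu 4, so socket 0 is never read. The per-cpu walk reaches both.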

> Thirdly, there is no serialization around calling perf_event_count() [or
> your pmu::count method] so you cannot temporarily put it to 0.

Urgh, thanks. Good spot. I'm gonna have to think of a suitable
serialisation mechanism because all the current ones are pretty
heavy-handed. And of course, there's the added fun that it needs to be
held across the IPIs.

Perhaps a per-cache-group mutex?
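
Roughly what I have in mind, as a userspace sketch with a pthread mutex standing in for a kernel mutex. The struct and function names (cache_group, cache_group_init, cache_group_recount, cache_group_read) are hypothetical, and the per_socket array stands in for the values the IPI handlers would report. Whether a sleeping lock is acceptable at every call site is not addressed here.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-cache-group state. */
struct cache_group {
	pthread_mutex_t lock;  /* serialises recount against readers */
	uint64_t count;
};

void cache_group_init(struct cache_group *g)
{
	pthread_mutex_init(&g->lock, NULL);
	g->count = 0;
}

/* Reset and re-accumulate under the lock, so no reader can observe
 * the temporary zero or a partial sum. The lock is held across the
 * whole accumulation, as it would have to be across the IPIs. */
void cache_group_recount(struct cache_group *g,
			 const uint64_t *per_socket, size_t nr)
{
	assert(per_socket != NULL || nr == 0);

	pthread_mutex_lock(&g->lock);
	g->count = 0;
	for (size_t i = 0; i < nr; i++)
		g->count += per_socket[i];
	pthread_mutex_unlock(&g->lock);
}

/* Readers take the same lock, so they only ever see a complete sum. */
uint64_t cache_group_read(struct cache_group *g)
{
	uint64_t v;

	pthread_mutex_lock(&g->lock);
	v = g->count;
	pthread_mutex_unlock(&g->lock);
	return v;
}
```

The cost is that every reader of the count serialises on the group, which is part of why I called the existing options heavy-handed.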

--
Matt Fleming, Intel Open Source Technology Center