Re: [PATCH 2/3] perf: arm_spe: Disable buffer before writing to PMBPTR_EL1 or PMBSR_EL1

From: James Clark
Date: Mon Jul 21 2025 - 09:20:30 EST




On 14/07/2025 9:58 am, Leo Yan wrote:
Hi Alexandru,

On Wed, Jul 09, 2025 at 11:08:57AM +0100, Alexandru Elisei wrote:

[...]

case SPE_PMU_BUF_FAULT_ACT_OK:
/*
@@ -679,18 +692,14 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev)
* PMBPTR might be misaligned, but we'll burn that bridge
* when we get to it.
*/
- if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) {
+ if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED))
arm_spe_perf_aux_output_begin(handle, event);
- isb();

I am a bit suspecious we can remove this isb().

As a reference to the software usage PKLXF in Arm ARM (DDI 0487 L.a),
after enable TRBE trace unit, an ISB is mandatory. Maybe check a bit
for this?

Wasn't this isb() to separate the programming of the registers with the
status register clear at the end of this function to enable profiling?

Enabling profiling buffer followed an isb() is not only for separating
other register programming.

As described in section D17.9, Synchronization and Statistical Profiling
in Arm ARM:

"A Context Synchronization event guarantees that a direct write to a
System register made by the PE in program order before the Context
synchronization event are observable by indirect reads and indirect
writes of the same System register made by a profiling operation
relating to a sampled operation in program order after the Context
synchronization event."

My understanding is: after the ARM SPE profiling is enabled, the
followed ISB is a Synchronization to make sure the system register
values are observed by SPE. And we cannot rely on ERET, especially if
we are tracing the kernel mode.

Thought about this some more.

Before:

arm_spe_pmu_buf_get_fault_act:
<drain buffer>
ISB
arm_spe_perf_aux_output_begin:
PMBLIMITR_EL1.E = 1
ISB
PMBSR_EL1.S = 0
ERET

Now:

PMBLIMITR_EL1 = 0
ISB

PMBSR_EL1.S = 0
arm_spe_perf_aux_output_begin:
ISB
PMBLIMITR_EL1.E = 1
ERET

I don't see much of a difference between the two sequences - the point after
which we can be sure that profiling is enabled remains the ERET from the
exception return. The only difference is that, before this change, the ERET
synchronized clearing the PMBSR_EL1.S bit, now it synchronizes setting the
PMBLIMITR_EL1.E bit.

Thoughts?

To make the discussion easier, I'll focus on the trace enabling flow
in this reply.

My understanding of a sane flow would be:

arm_spe_pmu_irq_handler() {
arm_spe_perf_aux_output_begin() {
SYS_PMBPTR_EL1 = base;

ISB // Synchronization between SPE register setting and
// enabling profiling buffer.
PMBLIMITR_EL1.E = 1;
}
ISB // Context synchronization event to ensure visibility to SPE
}

... start trace ... (Bottom half, e.g., softirq, etc)

ERET

In the current code, arm_spe_perf_aux_output_begin() is followed by an
ISB, which serves as a context synchronization event to ensure
visibility to the SPE. After that, it ensures that the trace unit will
function correctly.


But I think Alex's point is that in the existing code the thing that finally enables trace (PMBSR_EL1.S = 0) isn't followed by an isb(), only an ERET. So the new flow isn't any different in that regard.

I understand that the Software Usage PKLXF recommends using an ERET as
the synchronization point. However, between enabling the profiling
buffer and the ERET, the kernel might execute other operations (e.g.,
softirqs, tasklets, etc.).

Isn't preemption disabled? Surely that's not possible. Even if something did run it wouldn't be anything that touches the SPE registers, and we're sure there's an isb() between setting up the buffer and the final PMBLIMITR_EL1.E = 1


Therefore, it seems to me that using ERET as the synchronization point
may be too late. This is why I think we should keep an ISB after
arm_spe_perf_aux_output_begin().

Thanks,
Leo

Wouldn't that make the ERET too late even in the current code then? But I think we're agreeing there's no issue there?

James