Re: [RFC PATCH V2 05/13] perf/x86: Support XMM register for non-PEBS and REGS_USER

From: Liang, Kan
Date: Fri Jun 27 2025 - 17:23:40 EST




On 2025-06-27 10:35 a.m., Dave Hansen wrote:
> On 6/26/25 12:56, kan.liang@xxxxxxxxxxxxxxx wrote:
>> +static void x86_pmu_get_ext_regs(struct x86_perf_regs *perf_regs, u64 mask)
>> +{
>> + struct xregs_state *xsave = per_cpu(ext_regs_buf, smp_processor_id());
>> +
>> + if (WARN_ON_ONCE(!xsave))
>> + return;
>> +
>> + xsaves_nmi(xsave, mask);
>
> This makes me a little nervous.
>
> Could we maybe keep a mask around that reminds us what 'ext_regs_buf'
> was sized for and then ensure that no bits in the passed-in mask are set
> in that?
>

The x86_pmu.ext_regs_mask already tracks the available bits of
x86_pmu.ext_regs_buf, but it uses its own format.
I will convert it to the XSAVE format and add a check here.
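
For illustration only, the kind of check discussed above could look like
the standalone userspace sketch below. It is not the V3 code:
ext_regs_mask_ok() is a hypothetical helper name, and the XFEATURE_MASK_*
values are written out by hand to mirror the XSAVE bit layout (bit 1 for
SSE, bit 2 for YMM) rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* XSAVE-format feature bits, written out by hand for this sketch. */
#define XFEATURE_MASK_SSE	(1ULL << 1)
#define XFEATURE_MASK_YMM	(1ULL << 2)

static_assert(XFEATURE_MASK_SSE == 0x2, "SSE is bit 1 in XSAVE format");

/*
 * Hypothetical helper: returns true only if every bit in 'requested'
 * is also set in 'buf_mask', i.e. the buffer was sized for all of the
 * requested features.
 */
static bool ext_regs_mask_ok(uint64_t buf_mask, uint64_t requested)
{
	return (requested & ~buf_mask) == 0;
}
```

With both masks in the same format, the subset test is a single AND-NOT,
so it would be cheap enough to sit under a WARN_ON_ONCE() before the
xsaves_nmi() call.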


> I almost wonder if you want to add a
>
> struct fpu_state_config fpu_perf_cfg;
>
> I guess it's mostly overkill for this. But please do have a look at the
> data structures in:
>
> arch/x86/include/asm/fpu/types.h
>

It looks like overkill. The perf usage is simple, so one mask that
tracks the available bits should be good enough. The size comes from the
FPU's xstate_calculate_size(). I think the size can be trusted as long
as perf passes in the correct mask.

>> + if (mask & XFEATURE_MASK_SSE &&
>> + xsave->header.xfeatures & BIT_ULL(XFEATURE_SSE))
>> + perf_regs->xmm_space = xsave->i387.xmm_space;
>> +}
>
> There's a lot going on here.
>
> 'mask' and 'xfeatures' have the exact same format. Why use
> XFEATURE_MASK_SSE for one and BIT_ULL(XFEATURE_SSE) for the other?
>

Ah, my bad. The same XFEATURE_MASK_SSE should be used.

> Why check both? How could a bit get into 'xfeatures' without being in
> 'mask'?

The 'mask' is what perf wants/configures, while I think 'xfeatures' is
what XSAVE actually gives. I'm not quite sure whether the HW always
gives us everything we configured. If it doesn't, both checks are
required.

I'm thinking of adding the below first.

valid_mask = x86_pmu.ext_regs_mask & mask & xsave->header.xfeatures;

Then check each XFEATURE against valid_mask only.
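
To make the idea concrete, here is a minimal userspace sketch of the
valid_mask approach. It is an assumption-laden illustration, not the
patch: struct sketch_xsave, struct sketch_perf_regs and
sketch_get_ext_regs() are hypothetical stand-ins that keep only the
fields used here, and buf_mask plays the role of x86_pmu.ext_regs_mask
once it is in XSAVE format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XSAVE-format bit for SSE state, written out by hand for this sketch. */
#define XFEATURE_MASK_SSE	(1ULL << 1)

/* Stand-in for the XSAVE buffer: only the fields this sketch needs. */
struct sketch_xsave {
	uint64_t xfeatures;		/* components XSAVE actually wrote */
	uint32_t xmm_space[64];		/* 16 XMM registers, 16 bytes each */
};

/* Stand-in for the perf register snapshot. */
struct sketch_perf_regs {
	const uint32_t *xmm_space;	/* NULL when XMM is not available */
};

/*
 * A feature is used only if the buffer was sized for it (buf_mask),
 * perf asked for it (mask), and XSAVE reported it (xfeatures).
 */
static void sketch_get_ext_regs(struct sketch_perf_regs *regs,
				const struct sketch_xsave *xsave,
				uint64_t buf_mask, uint64_t mask)
{
	uint64_t valid_mask;

	assert(regs && xsave);

	valid_mask = buf_mask & mask & xsave->xfeatures;

	/* Always write the pointer so the caller never sees stale data. */
	regs->xmm_space = NULL;
	if (valid_mask & XFEATURE_MASK_SSE)
		regs->xmm_space = xsave->xmm_space;
}
```

Resetting the pointer unconditionally is a choice made for this sketch;
it gives the caller one thing to test regardless of which of the three
masks dropped the feature.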

>
> How does the caller handle the fact that ->xmm_space might be written or
> not?
>

For this series, the returned XMM value is zeroed if ->xmm_space is
NULL.
But I should also clear nr_vectors, so that nothing is dumped to
userspace when ->xmm_space is not available. I will address it in V3.

Thanks,
Kan