Re: [RFC PATCH 08/12] perf/x86: Add APX in extended regs

From: Liang, Kan
Date: Fri Jun 13 2025 - 13:17:24 EST




On 2025-06-13 12:02 p.m., Dave Hansen wrote:
> On 6/13/25 06:49, kan.liang@xxxxxxxxxxxxxxx wrote:
>> +#define __x86_pmu_get_regs(_mask, _regs, _size) \
>> +do { \
>> + if (mask & _mask && xcomp_bv & _mask) { \
>> + _regs = xsave; \
>> + xsave += _size; \
>> + } \
>> +} while (0)
>
> Ewww.
>
> First of all, this doesn't work generally because of the previously
> mentioned alignment. Second, it's using xcomp_bv which doesn't tell you
> if XSAVES wrote the data.
>
> Last, this attempts to reimplement get_xsave_addr().
>
> I'd do something like this:
>
> for (xfeature_nr in mask) {
> void *src = get_xsave_addr(xsave, xfeature_nr);
> void *dst = ... a function to map XFEATURE_MASK_APX
> to perf_regs->apx_regs,
> int size = xstate_sizes(xfeature_nr);
>
> if (!src)
> continue;
>
> memcpy(dst, src, size);

The data will eventually be copied to a buffer that is shared with the
user-space tool. I will probably have to avoid the memcpy here and do the
copy later, when outputting the sample.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/events/core.c#n7389

> }
>
> That should handle *all* of the nastiness. The alignment, the init
> optimization. *Please* use get_xsave_addr() or one of the other helpers.

Thanks, I will try to reuse the existing fpu functions as much as I can.
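
To illustrate the two points above (alignment and the init optimization),
here is a rough userspace model of the lookup that get_xsave_addr() does
for a compacted XSAVES buffer. It is only a sketch, not kernel code:
struct xfeature_info and compacted_offset() are hypothetical names, and
the per-feature size/alignment values would really come from CPUID leaf
0xD. The 576-byte base (512-byte legacy area plus 64-byte header) and
bit 63 of XCOMP_BV marking the compacted format follow the architectural
layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XSAVE_EXT_BASE      576u            /* legacy area (512) + header (64) */
#define XCOMP_BV_COMPACTED  (1ULL << 63)    /* compacted-format flag */

/* Hypothetical per-feature description, normally read from CPUID leaf 0xD. */
struct xfeature_info {
	uint32_t size;      /* size of the component in bytes */
	uint8_t  align64;   /* 1 if the component starts on a 64-byte boundary */
};

/*
 * Return the byte offset of extended component 'nr' in a compacted buffer,
 * or -1 if there is no data for it in the buffer.
 */
static long compacted_offset(uint64_t xstate_bv, uint64_t xcomp_bv,
			     const struct xfeature_info *info, unsigned int nr)
{
	assert(nr >= 2 && nr < 63);     /* bits 0-1 live in the legacy area */

	uint64_t bit = 1ULL << nr;
	size_t off = XSAVE_EXT_BASE;

	if (!(xcomp_bv & XCOMP_BV_COMPACTED))
		return -1;              /* this model only handles compacted buffers */
	if (!(xcomp_bv & bit))
		return -1;              /* no space reserved for this component */
	if (!(xstate_bv & bit))
		return -1;              /* init optimization: nothing was written */

	/* Walk the components that precede 'nr' in the buffer. */
	for (unsigned int i = 2; i < nr; i++) {
		if (!(xcomp_bv & (1ULL << i)))
			continue;
		if (info[i].align64)
			off = (off + 63) & ~(size_t)63;
		off += info[i].size;
	}

	if (info[nr].align64)
		off = (off + 63) & ~(size_t)63;

	return (long)off;
}
```

With made-up sizes of 256, 8 and 128 bytes for components 2, 9 and 19, and
only the last one 64-byte aligned, a plain "xsave += _size" walk would
place component 19 at offset 840, while the aligned offset is 896. A
component whose XSTATE_BV bit is clear yields -1 even though XCOMP_BV has
reserved space for it.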

Thanks,
Kan