Re: [RFC PATCH 03/12] x86/fpu/xstate: Add xsaves_nmi

From: Liang, Kan
Date: Fri Jun 13 2025 - 10:56:47 EST

On 2025-06-13 10:39 a.m., Dave Hansen wrote:
> On 6/13/25 06:49, kan.liang@xxxxxxxxxxxxxxx wrote:
>> + * This function can only be invoked in an NMI. It returns the *ACTUAL*
>> + * register contents when the NMI hit.
>
> Yes, but why is this important and what are the implications?
>
> It's important because all of the other mechanisms that deal with xstate
> are _trying_ to get something coherent. They're trying to, for instance,
> poke at the PKRU register for userspace and we need to ensure that the
> PKRU value that's being targeted is for the right task and is actually
> in memory (if that's what we're after).
>
> This interface is totally *in*coherent. There's no telling what was in
> the registers when the NMI hit. That seems crazy compared to all the
> other FPU code in the kernel. But it's actually OK for perf because
> there's a separate hardware mechanism that saves XSAVE-managed state off
> to memory. That mechanism also writes whatever was in the registers when
> the NMI hit. It's also completely incoherent.
>
> That's really the only reason this insanity is OK. perf can _already_
> handle XSAVE "snapshots" from random code running. This just provides
> another XSAVE data source at a random time.
>
> Could we get some of that ^ into the changelog and function comment, please?

Sure. Thanks for the details. I will add it to both the changelog and
the function comment.

>
> One other thing...
>
> XSAVES uses the modified optimization. That means if you did something
> like this:
>
> NMI=>
> xsaves_nmi();
> <=IRET
> ... run a little bit in the kernel
> NMI=> // another NMI
> xsaves_nmi();
> <=IRET
>
> The second XSAVES might not actually write anything to the buffer
> because the registers didn't change (they weren't modified). Is that OK?

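Yes, that is OK. The per-CPU buffer in perf is only used by this XSAVES.
Nothing else clears or modifies it between the two xsaves_nmi() calls, so
if the second XSAVES skips the write, the buffer still holds what the
first one wrote.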
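
As a rough illustration of the reasoning, here is a toy user-space model
of the two-NMI sequence. It is only a sketch: it is not kernel code, it
does not model the real hardware conditions under which the modified
optimization applies, and every toy_* name is made up for this example.
The "modified" flag stands in for whatever tracking the CPU does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for XSAVE-managed register state. */
struct toy_regs {
	unsigned long val;
	bool modified;		/* set whenever the registers are written */
};

/* Hypothetical stand-in for the per-CPU buffer used only by the save. */
struct toy_buf {
	unsigned long val;
};

static void toy_write_regs(struct toy_regs *r, unsigned long v)
{
	r->val = v;
	r->modified = true;
}

/*
 * Toy save with a "modified" optimization: skip the write when the
 * registers have not changed since the previous save.
 * Returns true if the buffer was actually written.
 */
static bool toy_xsaves(struct toy_regs *r, struct toy_buf *b)
{
	if (!r->modified)
		return false;
	b->val = r->val;
	r->modified = false;
	return true;
}

/*
 * Run the sequence: write registers, save (first NMI), optionally
 * clobber the buffer, save again (second NMI) without touching the
 * registers. Returns the final buffer contents.
 */
unsigned long toy_two_nmis(unsigned long v, bool clobber_between)
{
	struct toy_regs regs = { .val = 0, .modified = false };
	struct toy_buf buf = { .val = 0 };
	bool wrote;

	toy_write_regs(&regs, v);
	wrote = toy_xsaves(&regs, &buf);	/* first NMI: buffer written */
	assert(wrote);

	if (clobber_between)
		buf.val = 0;			/* someone else touched the buffer */

	wrote = toy_xsaves(&regs, &buf);	/* second NMI: write skipped */
	assert(!wrote);
	(void)wrote;

	return buf.val;
}
```

With clobber_between == false the buffer still matches the registers
after the skipped second save. With clobber_between == true the stale,
cleared buffer is what the caller would read, which is the case that
having a dedicated buffer avoids.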

Thanks,
Kan