Re: [PATCH v5 5/9] x86/nmi: Add support to handle NMIs with source information
From: Peter Zijlstra
Date: Thu May 08 2025 - 08:16:19 EST
On Wed, May 07, 2025 at 02:48:34PM -0700, Sohil Mehta wrote:
> On 5/7/2025 2:14 AM, Peter Zijlstra wrote:
> > On Tue, May 06, 2025 at 06:21:41PM -0700, Sohil Mehta wrote:
> >>
> >> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> >> index a1d672dcb6f0..183e3e717326 100644
> >> --- a/arch/x86/kernel/nmi.c
> >> +++ b/arch/x86/kernel/nmi.c
> >
> >> static int nmi_handle(unsigned int type, struct pt_regs *regs)
> >> {
> >> struct nmi_desc *desc = nmi_to_desc(type);
> >> + unsigned long source_bitmap = 0;
> >
> > unsigned long source = ~0UL;
> >
>
> Thanks! This makes the logic even simpler by getting rid of
> match_nmi_source(). A minor change described further down.
>
> Also, do you prefer "source" over "source_bitmap"? I had it as such to
> avoid confusion between source_vector and source_bitmap.
Yeah, I was lazy typing. Perhaps just call it bitmap then?
> >> nmi_handler_t ehandler;
> >> struct nmiaction *a;
> >> int handled=0;
> >> @@ -148,16 +164,40 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
> >>
> >> rcu_read_lock();
> >>
> >> + /*
> >> + * Activate NMI source-based filtering only for Local NMIs.
> >> + *
> >> + * Platform NMI types (such as SERR and IOCHK) have only one
> >> + * handler registered per type, so there is no need to
> >> + * disambiguate between multiple handlers.
> >> + *
> >> + * Also, if a platform source ends up setting bit 2 in the
> >> + * source bitmap, the local NMI handlers would be skipped since
> >> + * none of them use this reserved vector.
> >> + *
> >> + * For Unknown NMIs, avoid using the source bitmap to ensure all
> >> + * potential handlers have a chance to claim responsibility.
> >> + */
> >> + if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL)
> >> + source_bitmap = fred_event_data(regs);
> >
> > if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
> > source = fred_event_data(regs);
> > if (source & BIT(0))
> > source = ~0UL;
> > }
> >
>
> Looks good, except when fred_event_data() returns 0. I don't expect it
> to happen in practice. But, maybe with new hardware and eventually
> different hypervisors being involved, it is a possibility.
>
> We can either call it a bug that an NMI happened without source
> information. Or be extra nice and do this:
>
> if (cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) && type == NMI_LOCAL) {
> source = fred_event_data(regs);
> if (!source || (source & BIT(0)))
> source = ~0UL;
> }
Perhaps also WARN about the !source case?