Re: [PATCH] x86/mce: Do away with unnecessary context quirks
From: Yazen Ghannam
Date: Thu Aug 14 2025 - 11:44:57 EST
On Wed, Aug 13, 2025 at 02:20:26PM -0700, Luck, Tony wrote:
> On Wed, Aug 13, 2025 at 03:44:55PM +0000, Yazen Ghannam wrote:
>
> > -static noinstr void mce_gather_info(struct mce_hw_err *err, struct pt_regs *regs)
> > +static noinstr void mce_gather_info(struct mce_hw_err *err)
> > {
> > - struct mce *m;
> > /*
> > * Enable instrumentation around mce_prep_record() which calls external
> > * facilities.
> > @@ -467,29 +466,7 @@ static noinstr void mce_gather_info(struct mce_hw_err *err, struct pt_regs *regs
> > mce_prep_record(err);
> > instrumentation_end();
> >
> > - m = &err->m;
> > - m->mcgstatus = mce_rdmsrq(MSR_IA32_MCG_STATUS);
> > - if (regs) {
> > - /*
> > - * Get the address of the instruction at the time of
> > - * the machine check error.
> > - */
> > - if (m->mcgstatus & (MCG_STATUS_RIPV|MCG_STATUS_EIPV)) {
> > - m->ip = regs->ip;
> > - m->cs = regs->cs;
> > -
> > - /*
> > - * When in VM86 mode make the cs look like ring 3
> > - * always. This is a lie, but it's better than passing
> > - * the additional vm86 bit around everywhere.
> > - */
> > - if (v8086_mode(regs))
> > - m->cs |= 3;
> > - }
> > - /* Use accurate RIP reporting if available. */
> > - if (mca_cfg.rip_msr)
> > - m->ip = mce_rdmsrq(mca_cfg.rip_msr);
> > - }
>
> You moved an abbreviated version of this code from mce_gather_info() ...
>
> > static noinstr int error_context(struct mce *m, struct pt_regs *regs)
> > {
> > int fixup_type;
> > bool copy_user;
> >
> > - if ((m->cs & 3) == 3)
> > + /* Without register info, assume the worst. */
> > + if (!regs)
> > + return IN_KERNEL;
> > +
> > + m->ip = regs->ip;
> > + m->cs = regs->cs;
>
> ... to here in error_context().
>
> Would it work to hoist the error_context() code into mce_gather_info()
> and have it set a new mce::error_context field?
Yes, maybe. I do think we should determine the severity only once. That
may mean splitting context, action, message, etc., into separate helper
functions rather than doing it all in mce_severity().
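
A rough, untested sketch of the "compute the context once and cache it"
idea, just to illustrate the shape. Everything here is a stand-in: the
toy_* types and functions are hypothetical, the error_context field does
not exist in struct mce today, and the enum values are placeholders. The
extable fixup handling in the real error_context() is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder context values; not taken from the kernel headers. */
enum toy_context { TOY_IN_KERNEL = 1, TOY_IN_USER = 2 };

/* Stand-in for struct pt_regs: only the fields this sketch needs. */
struct toy_regs {
	uint64_t ip;
	uint64_t cs;
};

/* Stand-in for struct mce with a hypothetical cached context field. */
struct toy_mce {
	uint64_t ip;
	uint64_t cs;
	int error_context;	/* hypothetical: filled in once at gather time */
};

/* Classify the context; mirrors only the checks quoted in the patch. */
static int toy_error_context(struct toy_mce *m, const struct toy_regs *regs)
{
	/* Without register info, assume the worst. */
	if (!regs)
		return TOY_IN_KERNEL;

	m->ip = regs->ip;
	m->cs = regs->cs;

	if ((m->cs & 3) == 3)
		return TOY_IN_USER;

	return TOY_IN_KERNEL;
}

/* Called once per machine check: gather the info and cache the context. */
static void toy_gather_info(struct toy_mce *m, const struct toy_regs *regs)
{
	m->error_context = toy_error_context(m, regs);
}

/* Later severity/message lookups read the cached value, no regs needed. */
static const char *toy_context_name(const struct toy_mce *m)
{
	return m->error_context == TOY_IN_USER ? "user" : "kernel";
}
```

With something along these lines, repeated severity lookups would not
need regs at all; they would only read the cached field.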
>
> I ask because mce_gather_info() is called once, while error_context()
> is called multiple times (on Intel ... not sure of flow on AMD).
>
Right, error_context() is called once per mce_severity() call. Intel has
multiple calls because of the monarch reign flow. Those additional calls
are just to get the "message" for printing.

I have another revision with a more minimal diff. It covers only the
quirks case and leaves out the changes for VM86 and mce_severity() for
now.

Thanks,
Yazen