Re: [PATCH] x86/microcode: Add an option to reload microcode even if revision is unchanged

From: Raj, Ashok
Date: Mon Sep 16 2019 - 20:34:08 EST


On Mon, Sep 16, 2019 at 12:36:11PM +0200, Thomas Gleixner wrote:
> > On Fri, 6 Sep 2019, Raj, Ashok wrote:
> > > On Fri, Sep 06, 2019 at 11:16:00PM +0200, Thomas Gleixner wrote:
> > > > Now #1 is actually a sensible and feasible solution which can be pulled off
> > > > in a reasonably short time frame, avoids all the bound to be ugly and
> > > > failure laden attempts of fixing late loading completely and provides a
> > > > usable and safe solution for joe user, jack admin and the super experts at
> > > > big-cloud corporate.
> > > >
> > > > That is not requiring any new format of microcode payload, as this can be
> > > > nicely done as a metadata package which comes with the microcode
> > > > payload. So you get the following backwards compatible states:
> > > >
> > > > Kernel    metadata      result
> > > >
> > > > old       don't care    refuse late load
> > > > new       No            refuse late load
> > > > new       Yes           decide based on metadata
> > > >
> > > > Thoughts?
> > >
> > > This is 100% in line with what we proposed...
> >
> > So what is hindering you to implement that? ucode teams whining about the
> > little bit of extra work?
>
> That said, there is also a distinct lack of information about micro code
> loading in a safe way in general. We absolutely do not know whether a micro
> code update affects any instruction which might be in use during the update
> on a sibling. Right now it's all load and pray and the SDM is not really
> helpful with that either.
>

Guilty as charged :-). In general we do not expect microcode updates to
remove any cpuid bits (not that it hasn't happened, but that one slipped
through the cracks).

Microcode updates should fall into 3 types:

- Only loadable from BIOS (only via FIT tables).
- Suitable for early load (e.g. updates that change cpuid bits).
- Suitable for late load (where no cpuid bits should change, etc.).
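
To make the three types and the kernel/metadata table quoted above a bit
more concrete, here is a rough sketch of how the late-load decision could
look. This is only an illustration and not existing kernel code: the enum,
the struct ucode_meta layout, and late_load_allowed() are all hypothetical
names, and no metadata format has been defined yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification mirroring the three types listed above. */
enum ucode_load_class {
	UCODE_BIOS_ONLY,	/* only loadable from BIOS via FIT */
	UCODE_EARLY_ONLY,	/* may change cpuid bits, early load only */
	UCODE_LATE_OK,		/* no cpuid changes, suitable for late load */
};

/*
 * Hypothetical metadata descriptor shipped next to the microcode payload.
 * has_metadata is false when the package carries no metadata at all.
 */
struct ucode_meta {
	bool has_metadata;
	enum ucode_load_class load_class;
};

/*
 * Decision table from the quoted mail:
 *
 *   kernel   metadata     result
 *   old      don't care   refuse late load
 *   new      no           refuse late load
 *   new      yes          decide based on metadata
 *
 * kernel_is_metadata_aware models the old/new kernel column.
 */
static bool late_load_allowed(bool kernel_is_metadata_aware,
			      const struct ucode_meta *m)
{
	assert(m != NULL);	/* callers always pass a descriptor */

	if (!kernel_is_metadata_aware)
		return false;	/* old kernel: refuse */
	if (!m->has_metadata)
		return false;	/* new kernel, no metadata: refuse */

	/* new kernel with metadata: only the late-load type is accepted */
	return m->load_class == UCODE_LATE_OK;
}
```

With this sketch a package marked UCODE_LATE_OK is accepted only by a
metadata-aware kernel, and everything else falls back to refusing the
late load.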

Today we do the late load under stop_machine(), so all threads in the
system are held hostage until all the cores have done the update. The
thread sibling is also in the rendezvous loop.

Do you think we still have that risk with a sibling thread?
(Assuming future ucodes don't do weird things like what happened in
that case where a cpuid bit was removed via an update.)

Cheers,
Ashok