Re: [PATCH] x86/microcode: Add an option to reload microcode even if revision is unchanged

From: Raj, Ashok
Date: Fri Sep 06 2019 - 20:38:09 EST


On Fri, Sep 06, 2019 at 11:16:00PM +0200, Thomas Gleixner wrote:
>
> So if we want to do late microcode loading in a sane way then there are
> only a few options and none of them exist today:
>
> 1) Micro-code contains a description of CPUID bits which are going to be
> exposed after the load. Then the kernel can sanity check whether this
> changes anything relevant or not. If there is a relevant change it can
> reject the load and tell the admin that a reboot is required.

This is pretty much what we had in mind when we suggested it to the uCode teams.

It is just a process of providing a metadata file to accompany every uCode release.

IMO new cpuid bits are probably less harmful than old ones disappearing.
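
To illustrate that asymmetry, here is a minimal sketch of how a before/after
comparison could separate bits that appear from bits that disappear. This is
not kernel code: NWORDS, the struct names, cpuid_compare(), classify_diff()
and the three verdicts are all hypothetical, and the policy (treat any lost
bit as the worst case) is only an assumption drawn from the remark above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical number of 32-bit feature words; not the kernel's real layout. */
#define NWORDS 4

struct cpuid_snapshot {
	uint32_t word[NWORDS];		/* feature bits captured before or after a load */
};

struct cpuid_diff {
	uint32_t appeared[NWORDS];	/* set after the load, clear before */
	uint32_t disappeared[NWORDS];	/* set before the load, clear after */
};

/* Assumed policy outcomes, ordered from harmless to most harmful. */
enum load_verdict {
	LOAD_UNCHANGED,
	LOAD_NEW_BITS_ONLY,
	LOAD_BITS_LOST,
};

/* Split the difference between two snapshots into gained and lost bits. */
void cpuid_compare(const struct cpuid_snapshot *before,
		   const struct cpuid_snapshot *after,
		   struct cpuid_diff *out)
{
	assert(before && after && out);

	for (size_t i = 0; i < NWORDS; i++) {
		out->appeared[i] = after->word[i] & ~before->word[i];
		out->disappeared[i] = before->word[i] & ~after->word[i];
	}
}

/* Any lost bit dominates; gained bits are reported separately. */
enum load_verdict classify_diff(const struct cpuid_diff *d)
{
	bool gained = false;

	assert(d);

	for (size_t i = 0; i < NWORDS; i++) {
		if (d->disappeared[i])
			return LOAD_BITS_LOST;
		if (d->appeared[i])
			gained = true;
	}
	return gained ? LOAD_NEW_BITS_ONLY : LOAD_UNCHANGED;
}
```

With accompanying metadata, the "after" snapshot could in principle be known
before the load is attempted, so a check like this could run up front rather
than after the fact.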



>
> 2) Rework CPUID feature handling so that it can reevaluate and reconfigure
> the running system safely. There are a lot of things you need for that:
>
> A) Introduce a safe state for CPUs to reach which guarantees that none
> of the CPUs will return from that state via a code path which
> depends on previous state and might now go the other route with data
> on the stack which only fits the previous configuration.
>
> B) Make all the cpufeature thingies run time switchable. That means
> that you need to keep quite some code around which is currently init
> only. That also means that you have to provide backout code for
> things which set up data corresponding to cpu feature bits and so
> forth.
>
> So #2 might be finished in about 20 years from now with the result that
> some of the code paths might simply still have a

Maybe we can catch up on the kernel side in 20 years... user space would still
be busted, or would need a fault-based way to control new cpuid bits, much like
how we do for VMs.

>
> 	if (cpufeature_changed())
> 		panic();
>
> because there are things which you cannot back out. So the only sane
> solution is to panic. Which is not a solution as it would be much more sane
> to prevent late loading upfront and force people to reboot proper.
>
> Now #1 is actually a sensible and feasible solution which can be pulled off
> in a reasonably short time frame, avoids all the bound to be ugly and
> failure laden attempts of fixing late loading completely and provides a
> usable and safe solution for joe user, jack admin and the super experts at
> big-cloud corporate.
>
> That is not requiring any new format of microcode payload, as this can be
> nicely done as a metadata package which comes with the microcode
> payload. So you get the following backwards compatible states:
>
>    Kernel    metadata      result
>
>    old       don't care    refuse late load
>    new       No            refuse late load
>    new       Yes           decide based on metadata
>
> Thoughts?

This is 100% in line with what we proposed...
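
For reference, the three backwards compatible states in the table above could
be expressed roughly as follows. This is only a sketch: the enum, the function
name and its two boolean parameters are hypothetical, and the actual "decide
based on metadata" step is left out because its format is not defined here.

```c
#include <stdbool.h>

/* Hypothetical outcomes matching the "result" column of the table. */
enum late_load_result {
	REFUSE_LATE_LOAD,
	DECIDE_FROM_METADATA,
};

/*
 * kernel_knows_metadata: false for an "old" kernel, true for a "new" one.
 * metadata_present:      whether a metadata package came with the payload.
 */
enum late_load_result late_load_policy(bool kernel_knows_metadata,
				       bool metadata_present)
{
	/* Old kernel: metadata is "don't care", late load is refused. */
	if (!kernel_knows_metadata)
		return REFUSE_LATE_LOAD;

	/* New kernel without metadata: nothing to check against, refuse. */
	if (!metadata_present)
		return REFUSE_LATE_LOAD;

	/* New kernel with metadata: the metadata drives the decision. */
	return DECIDE_FROM_METADATA;
}
```

Only the last case ever permits a late load, so the default stays conservative
for existing kernels and existing microcode packages.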

Cheers,
Ashok