Re: [PATCH v2] x86: fix early boot crash on gcc-10

From: Borislav Petkov
Date: Fri Apr 17 2020 - 05:09:19 EST


On Fri, Apr 17, 2020 at 10:58:59AM +0200, Jakub Jelinek wrote:
> On Fri, Apr 17, 2020 at 10:42:24AM +0200, Borislav Petkov wrote:
> > On Fri, Apr 17, 2020 at 10:07:26AM +0200, Jakub Jelinek wrote:
> > > If you want minimal changes, you can as I said earlier either
> > > mark cpu_startup_entry noreturn (in the declaration in some header so that
> > > smpboot.c sees it), or you could add something after the cpu_startup_entry
> > > call to ensure it is not tail call optimized (e.g. just
> > > /* Prevent tail call to cpu_startup_entry because the stack
> > > protector guard has been changed in the middle of this function
> > > and must not be checked before tail calling another function. */
> > > asm ("");
> >
> > That sounds ok-ish to me too.
> >
> > I know you probably can't tell the future :) but what stops gcc from
> > doing the tail-call optimization in the future?
> >
> > Or are optimization decisions behind an inline asm a no-no and will
> > pretty much always stay that way?
>
> GCC intentionally treats asm as a black box, the only thing which it does
> with it is: non-volatile asm (but asm without outputs is implicitly
> volatile) can be CSEd, and if the compiler needs to estimate size, it
> uses some heuristics by counting ; and newlines.
> And it will stay this way.
>
> > And I hope the clang folks don't come around and say, err, nope, we're
> > much more aggressive here.
>
> Unlike GCC, I think clang uses the builtin assembler to parse the string,
> but don't know if it still treats the asms more like black boxes or not.
> Certainly there is a lot of code in the wild that uses inline asm
> as optimization barriers, so if it doesn't, then it would cause a lot of
> problems.
>
> Or go with the for (;;);, I don't think any compiler optimizes those away;
> GCC 10 for C++ can optimize away infinite loops that have some conditional
> exit because the language guarantees forward progress, but the C language
> rules are different and for unconditional infinite loops GCC doesn't
> optimize them away even if explicitly asked to -ffinite-loops.

Lemme add Nick for clang for an opinion:

Nick, we're discussing what would be the cleanest and future-proof
way to disable stack protector for the function in the kernel which
generates the canary value as gcc10 ends up checking that value due to
tail-call optimizing the last function called by start_secondary()...
upthread are all the details.

And question is, can Jakub's suggestions above prevent tail-call
optimization on clang too and how reliable and future proof would that
be if we end up going that way?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette