Re: [GIT pull] x86/asm for 5.1

From: Linus Torvalds
Date: Mon Mar 11 2019 - 11:39:40 EST


On Mon, Mar 11, 2019 at 7:22 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> > So it basically "pins" the bits on CPUs before they are ready, so the
> > secondary CPUs magically get the bits set _independently_ of running
> > the actual setup code to set them. That may *work*, but it sure looks
> > iffy.
>
> Which I looked at and it's a non issue. The SMAP/SMEP settings are globally
> the same

No, Thomas, they really aren't globally the same, and that difference
causes problems in that patch, particularly in the actual "security"
part of it.

> and the non-boot CPUs set the bits very early in the boot process.

They are set early in the boot process, but they are set separately
for each CPU, and not at the same time.

And that's important. It's important because when the *first* CPU sets
the "you now need to pin and check the SMAP bit" state, the _other_
CPUs have not set it yet.

Why is that a real and big deal?

It's a real and big deal because when the *other* CPUs get around to
booting and do _their_ thing, they don't have their per-cpu bits set
yet, so _as_ they are setting up, they will be doing that

cr4_set_bits(X86_CR4_SMEP/SMAP/UMIP);

dance, setting one bit at a time, and as they do that, they don't
yet have all of the pinned bits set.
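
To make the ordering concrete, this is roughly the shape of what each
secondary CPU does as it comes up (a simplified sketch, not the actual
common.c code: the function name here is made up, while cpu_has(),
cr4_set_bits() and the feature/CR4 bit names are the real ones):

static void setup_cr4_features(struct cpuinfo_x86 *c)
{
        /* Each feature is enabled with its own CR4 write ... */
        if (cpu_has(c, X86_FEATURE_SMEP))
                cr4_set_bits(X86_CR4_SMEP);     /* SMAP and UMIP still clear */

        if (cpu_has(c, X86_FEATURE_SMAP))
                cr4_set_bits(X86_CR4_SMAP);     /* UMIP still clear */

        if (cpu_has(c, X86_FEATURE_UMIP))
                cr4_set_bits(X86_CR4_UMIP);

        /*
         * ... so only after the last call does this CPU's CR4 contain
         * all of the bits the boot CPU already decided to pin.  A naive
         * "warn on any CR4 write that lacks a pinned bit" check would
         * therefore fire during perfectly normal secondary-CPU bringup.
         */
}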

So what does that result in?

It results in that insanity in the patch where it *first* does the

val |= cr4_pin;

which is entirely insane for three important reasons:

(a) doing that is what causes 99% of the pain: the compiler has now
seen that 'val' already has the bits set, so it may well optimize out
the test afterwards, and you have to play extra games to stop it.

(b) doing that first is pointless from a ROP use angle, since an
attacker would use the address just _after_ that operation, so now it's
confusing anybody who reads the code into maybe thinking that the
initial "or" has some security meaning

(c) BUT it's actually *bad* for security, because that early 'or'
means you lose the warning about possible _real_ mistakes, where
somebody hadn't set the bit or had screwed up some other way, since
you're now hiding the problem by setting the bits early.

See? It's a real downside. It makes the code look odd and pointless,
and in the process it actually gives *less* verification coverage and
attack warning, because it will not warn about somebody who reached
that "set cr4" part with bits clear by other tricks (for example, by
corrupting the argument, or by just getting the ROP address from the
pvops table for 'set_cr4()').

So it literally makes the patch uglier and *weaker*.
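
To put the difference in code form (a sketch only, not the literal
patch: 'cr4_pin' is the pinned-bits mask from the patch, and the two
function names are just labels I made up for the two shapes):

/* The shape of the patch: "repair" the value before writing it. */
static inline void native_write_cr4_patched(unsigned long val)
{
        val |= cr4_pin;         /* silently fix whatever the caller passed */
        asm volatile("mov %0,%%cr4" : "+r" (val) : : "memory");
        /*
         * Any check of 'val' after this point checks a value we just
         * patched up ourselves, so a caller that got here with the
         * pinned bits clear is never noticed, let alone warned about.
         */
}

/* The shape I'm arguing for: write what the caller asked for, verify it. */
static inline void native_write_cr4_checked(unsigned long val)
{
        asm volatile("mov %0,%%cr4" : "+r" (val) : : "memory");
        /*
         * Warn (and presumably put the pinned bits back) instead of
         * silently papering over a caller that lost them.
         */
        WARN_ONCE((val & cr4_pin) != cr4_pin,
                  "pinned CR4 bits cleared?!\n");
}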

Side note: my suggested version was not as strong as it should have
been either. The actual inline asm should still have the "+r" part to
make sure it actually tests the register value (AFTER the write!) that
it used, and that there's no reload or similar by the compiler that
makes it test some other value than the one it actually wrote to
%cr4.

So the inline asm should probably be

asm volatile("mov %0,%%cr4" : "+r" (val) : : "memory");

so that it actually tests 'val' after the write.
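
For comparison, with a plain input constraint the compiler is told the
asm only reads 'val', so it is free to test a separate copy of it
afterwards (a reload from the stack, say) rather than the register
that was actually written; the read-write "+r" form ties any later
test to that exact register (sketch again, same 'cr4_pin' mask as
above):

/* Weaker: the value tested later need not be the register written. */
asm volatile("mov %0,%%cr4" : : "r" (val) : "memory");

/* Stronger: 'val' now *is* the register value that went into %cr4. */
asm volatile("mov %0,%%cr4" : "+r" (val) : : "memory");
WARN_ONCE((val & cr4_pin) != cr4_pin, "pinned CR4 bits cleared?!\n");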

Anyway, I repeat: when a patch is trying to fix a really esoteric
security issue, and the *only* reason for the patch is that actual
attack, then the fix in question should really be held to much higher
standards than this patch was held to.

When the comments are actively garbage and don't match the code, when
the code doesn't actually do a good job of testing against the attack
despite _trying_ to, and actively DOES NOT WARN if somebody got there
by picking up the pointer to the function, then the patch is simply
not a great patch.

Don't make excuses for a bad patch.

Linus