Re: PROBLEM: consolidated IDT invalidation causes kexec to reboot

From: Linus Torvalds
Date: Tue Dec 26 2017 - 13:51:21 EST


[ Sorry, I was off-line on Christmas Eve due to festivities, and then
yesterday because I've apparently caught a cold.

Still not back to normal, but at least I can sit in front of the
computer again ]

On Mon, Dec 25, 2017 at 1:29 PM, Alexandru Chirvasitu
<achirvasub@xxxxxxxxx> wrote:
>
> On Mon, Dec 25, 2017 at 06:40:14AM -0800, Andy Lutomirski wrote:
>>
>> This is presumably the same call-tracing-without-TLS-working problem.
>> idt_invalidate() is out-of-line and is compiled with full tracing on,

Yeah. The difference literally seems to be that in the old case it was
accidentally inlined.

I say "accidentally", because "load_idt()" itself is explicitly
inlined, but the "set_idt()" in machine_kexec_32.c was not.

But before that commit ("e802a51ede91 x86/idt: Consolidate IDT
invalidation") the compiler inlined it anyway because it was a small
static function.

Afterwards, not so much (different C file), and then the stack tracing
code blew up because of the incomplete CPU state.

> This works.. I went back to the troublesome commit e802a51 and
> modified it as follows:
>
> +/**
> + * idt_invalidate - Invalidate interrupt descriptor table
> + * @addr: The virtual address of the 'invalid' IDT
> + */
> +static inline void idt_invalidate(void *addr)
> +{
> + struct desc_ptr idt = { .address = (unsigned long) addr, .size = 0 };
> +
> + load_idt(&idt);
> +}

Yes, I suspect that is the right thing to do. It's small enough that
inliningh it makes sense.

HOWEVER. Would you mind testing a totally different fix instead?

In particular, take the current top of tree (that doesn't work for
you), and try to just change the order of these two lines:

set_gdt(phys_to_virt(0), 0);
idt_invalidate(phys_to_virt(0));

in arch/x86/kernel/machine_kexec_32.c.

I think it's a better idea to invalidate the IDT first, because that
is only used for exceptions. In contrast, invalidating the GDT will
obviously make any segment load do horrible things, _and_ any
exceptions would fail anyway (because exceptions need segments too).

So in many ways, that "set_get()" that invalidates the GDT is the more
destructive thing, and should be done last.

And if we do it last, maybe the whole "oops, we have tracing code
enabled" thing wouldn't have mattered.

Does that trivial line switching make the old broken config work for you again?

> kexec now works as expected; tested repeatedly, both with direct
> execution and crash triggering.
>
> I had to google 'inline function' :)).

We'll make a kernel developer out of you yet. You've already found the
most important development tool (I kid, I kid. Google is useful, but
"willingness to try things out" is actually the #1 thing).

Mind googling "linux kernel patch submission" and adding the required
sign-off, and I suspect the x86 people will happily take your patch?

That said, I do wonder about a few things:

- the 'addr' argument is pointless, afaik. I *suspect* it used to be
0, and then some mindless editing just changed it to that
"phys_to_virt(0)".

With a zero length, it shouldn't matter what the actual IDT base
address actually is. Any access is going to trap regardless.

- some people were clearly aware of just how critical that whole
"load_idt()" sequence were, because things were marked "inline" and
"NOKPROBE_SYMBOL()" etc, but there was no comment in the code that
actually did this about how the machine state is total garbage after
the "set_gdt()" in machine_kexec().

- the above "I think we should invalidate GDT last" issue.

Hmm?

Linus