Re: Linux 5.3-rc7

From: Bandan Das
Date: Mon Sep 09 2019 - 09:54:17 EST


Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Sat, Sep 7, 2019 at 12:17 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> I'm really not clear on why it's a good idea to clear the LDR bits on
>> shutdown, and commit 558682b52919 ("x86/apic: Include the LDR when
>> clearing out APIC registers") just looks pointless. And now it has
>> proven to break some machines.
>>
>> So why wouldn't we just revert it?
>
> Side note: looking around for the discussion about this patch, at
> least one version of the patch from Bandan had
>
> + if (!x2apic_enabled) {
>
> rather than
>
> + if (!x2apic_enabled()) {
>

I believe this crept up by accident when I was preparing the series, my testing
was with x2apic_enabled() but I didn't test CPU hotplug - only the kdump path
with 32 bit guest. In hindsight, I should have been more careful with testing,
sorry about that.

Bandan

> which meant that whatever Bandan tested at that point was actually a
> complete no-op, since "!x2apic_enabled" is never true (it tests a
> function pointer against NULL, which it won't be).
>
> Then that was fixed by the time it hit -tip (and eventually my tree),
> but it kind of shows how the patch history of this is all
> questionable. Further strengthened by a quote from that discussion:
>
> "this is really a KVM bug but it doesn't hurt to clear out the LDR in
> the guest and then, it wouldn't need a hypervisor fix"
>
> and clearly it *does* hurt to clear the LDR in the guest, making the
> whole thinking behind the patch wrong and broken. The kernel clearly
> _does_ depend on LDR having the right contents.
>
> Now, I still suspect the boot problem then comes from our
> cpu0_logical_apicid use mentioned in that previous email, but at this
> point I think the proper fix is "revert for now, and we can look at
> this as a cleanup with the cpu0_logical_apicid thing for 5.4 instead".
>
> Hmm?
>
> Linus