Re: [tip:x86/apic] x86/acpi: Introduce persistent storage for cpuid <-> apicid mapping

From: Yinghai Lu
Date: Tue Oct 04 2016 - 02:02:08 EST


On Thu, Sep 22, 2016 at 12:10 PM, tip-bot for Gu Zheng <tipbot@xxxxxxxxx> wrote:
>
> x86/acpi: Introduce persistent storage for cpuid <-> apicid mapping
>
> The whole patch-set aims at making cpuid <-> nodeid mapping persistent. So that,
> when node online/offline happens, cache based on cpuid <-> nodeid mapping such as
> wq_numa_possible_cpumask will not cause any problem.
> It contains 4 steps:
> 1. Enable apic registration flow to handle both enabled and disabled cpus.
> 2. Introduce a new array storing all possible cpuid <-> apicid mapping.
> 3. Enable _MAT and MADT relative apis to return non-present or disabled cpus' apicid.
> 4. Establish all possible cpuid <-> nodeid mapping.
>
> This patch finishes step 2.
>
> In this patch, we introduce a new static array named cpuid_to_apicid[],
> which is large enough to store info for all possible cpus.
>
> And then, we modify the cpuid calculation. In generic_processor_info(),
> it simply finds the next unused cpuid. And it is also why the cpuid <-> nodeid
> mapping changes with node hotplug.
>
> After this patch, we find the next unused cpuid, map it to an apicid,
> and store the mapping in cpuid_to_apicid[], so that cpuid <-> apicid
> mapping will be persistent.
>
> And finally we will use this array to make cpuid <-> nodeid persistent.
>
> cpuid <-> apicid mapping is established at local apic registration time.
> But non-present or disabled cpus are ignored.
>
> In this patch, we establish all possible cpuid <-> apicid mapping when
> registering local apic.

Hi,

This one causes a regression on an 8-socket system: MLC from Intel does not run
anymore.

The root cause is that the cpu index used to be 0-447;
with this patch, the cpu index changes to 0, 2-448.

The MADT from the system looks like:
[ 42.107902] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[ 42.120125] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[ 42.132361] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[ 42.144598] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[ 42.156836] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
...
[ 47.552852] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[ 47.565088] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[ 47.577322] ACPI: LAPIC (acpi_id[0xff] lapic_id[0xff] disabled)
[ 47.589561] ACPI: X2APIC (uid[0x00] apic_id[0x00] enabled)
[ 47.600899] ACPI: X2APIC (uid[0x02] apic_id[0x02] enabled)
[ 47.612234] ACPI: X2APIC (uid[0x04] apic_id[0x04] enabled)
...

The init_cpu_to_node output becomes:
[ 55.477160] init_cpu_to_node:
[ 55.483280] cpu 0 -> apicid 0x0 -> node 0
[ 55.491558] cpu 1 -> apicid 0xff -> node 1
[ 55.500017] cpu 2 -> apicid 0x2 -> node 0
[ 55.508296] cpu 3 -> apicid 0x4 -> node 0
[ 55.516575] cpu 4 -> apicid 0x6 -> node 0
...

It looks like the problem is in this call chain:

acpi_parse_lapic==>acpi_register_lapic==>__generic_processor_info==>allocate_logical_cpuid

There, the disabled lapic_id[0xff] entry takes cpu index 1.

Then there is no /dev/cpu/1/msr, and that makes MLC unhappy.
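
To make the ordering effect concrete, here is a minimal userspace sketch of how
the index gets consumed. It is a simplified model, not the kernel code: the
array size MAX_CPUS, the helpers reset_mapping() and replay_madt(), and the
assumption that cpu 0 is reserved for the boot cpu before the MADT is parsed
are mine, for illustration only. allocate_logical_cpuid() here just mimics
"reuse the index if the apicid is already known, else take the next free one".

```c
#include <assert.h>
#include <stdio.h>

#define MAX_CPUS 8      /* hypothetical limit for this sketch */
#define BOOT_APICID 0x0 /* apic id of the boot cpu in the log above */

/* Stand-ins for the persistent mapping table and its fill level. */
static int cpuid_to_apicid[MAX_CPUS];
static int nr_logical_cpuids;

/* Assumption: cpu 0 is reserved for the boot cpu before MADT parsing. */
static void reset_mapping(void)
{
	for (int i = 0; i < MAX_CPUS; i++)
		cpuid_to_apicid[i] = -1;
	cpuid_to_apicid[0] = BOOT_APICID;
	nr_logical_cpuids = 1;
}

/* Reuse the index already mapped to apicid, else take the next free one. */
static int allocate_logical_cpuid(int apicid)
{
	for (int i = 0; i < nr_logical_cpuids; i++)
		if (cpuid_to_apicid[i] == apicid)
			return i;
	if (nr_logical_cpuids >= MAX_CPUS)
		return -1;
	cpuid_to_apicid[nr_logical_cpuids] = apicid;
	return nr_logical_cpuids++;
}

/* Replay MADT entries in table order and print the resulting mapping. */
static void replay_madt(const int *apicids, int n)
{
	assert(n >= 0);
	reset_mapping();
	for (int i = 0; i < n; i++)
		allocate_logical_cpuid(apicids[i]);
	for (int cpu = 0; cpu < nr_logical_cpuids; cpu++)
		printf("cpu %d -> apicid 0x%x\n", cpu, cpuid_to_apicid[cpu]);
}
```

Feeding replay_madt() the entry order from the MADT above (a run of 0xff
entries first, then 0x00, 0x02, 0x04, 0x06) gives 0xff the first free index
after the boot cpu, so the enabled cpus are pushed up by one, which is the
same shape as the init_cpu_to_node output.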

The following change could work around the problem at this point.

Index: linux-2.6/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/acpi/boot.c
+++ linux-2.6/arch/x86/kernel/acpi/boot.c
@@ -163,10 +163,11 @@ static int __init acpi_parse_madt(struct
* @id: local apic id to register
* @acpiid: ACPI id to register
* @enabled: this cpu is enabled or not
+ * @disabled_id: not used apic id
*
* Returns the logic cpu number which maps to the local apic
*/
-static int acpi_register_lapic(int id, u32 acpiid, u8 enabled)
+static int acpi_register_lapic(int id, u32 acpiid, u8 enabled, int disabled_id)
{
unsigned int ver = 0;
int cpu;
@@ -176,6 +177,11 @@ static int acpi_register_lapic(int id, u
return -EINVAL;
}

+ if (!enabled && (id == disabled_id)) {
+ ++disabled_cpus;
+ return -EINVAL;
+ }
+
if (boot_cpu_physical_apicid != -1U)
ver = boot_cpu_apic_version;

@@ -213,7 +219,7 @@ acpi_parse_x2apic(struct acpi_subtable_h
if (!apic->apic_id_valid(apic_id) && enabled)
printk(KERN_WARNING PREFIX "x2apic entry ignored\n");
else
- acpi_register_lapic(apic_id, processor->uid, enabled);
+ acpi_register_lapic(apic_id, processor->uid, enabled, -1);
#else
printk(KERN_WARNING PREFIX "x2apic entry ignored\n");
#endif
@@ -242,7 +248,7 @@ acpi_parse_lapic(struct acpi_subtable_he
*/
acpi_register_lapic(processor->id, /* APIC ID */
processor->processor_id, /* ACPI ID */
- processor->lapic_flags & ACPI_MADT_ENABLED);
+ processor->lapic_flags & ACPI_MADT_ENABLED, 0xff);

return 0;
}
@@ -261,7 +267,7 @@ acpi_parse_sapic(struct acpi_subtable_he

acpi_register_lapic((processor->id << 8) | processor->eid,/* APIC ID */
processor->processor_id, /* ACPI ID */
- processor->lapic_flags & ACPI_MADT_ENABLED);
+ processor->lapic_flags & ACPI_MADT_ENABLED, -1);

return 0;
}
@@ -725,7 +731,7 @@ int acpi_map_cpu(acpi_handle handle, phy
{
int cpu;

- cpu = acpi_register_lapic(physid, U32_MAX, ACPI_MADT_ENABLED);
+ cpu = acpi_register_lapic(physid, U32_MAX, ACPI_MADT_ENABLED, -1);
if (cpu < 0) {
pr_info(PREFIX "Unable to map lapic to logical cpu number\n");
return cpu;