Re: [PATCH 10/26] x86-64, NUMA: Move apicid to numa mappinginitialization from amd_scan_nodes() to amd_numa_init()

From: Yinghai Lu
Date: Tue Feb 15 2011 - 12:54:16 EST


On Tue, Feb 15, 2011 at 9:31 AM, Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
> On 02/15/2011 12:36 PM, Tejun Heo wrote:
> ...
>>>
>>>   Hi Tejun, while you at it, it seems apicid_base conditional assignment
>>> is
>>> redundant here (boot_cpu_physical_apicid is unsigned int) so we might
>>> have
>>> something like
>>>
>>>        apicid_start    = boot_cpu_physical_apicid;
>>>        apicid_end      = apicid_start + cores;
>>>
>>>        for_each_node_mask(i, cpu_nodes_parsed) {
>>>                for (j = apicid_start; j<   apicid_end; j++)
>>>                        set_apicid_to_node((i<<   bits) + j, i);
>>>        }
>>
>> Right, I think the intention there was
>>
>>        if (boot_cpu_physical_apicid == -1U)
>>
>> because that's the initial value and we don't really want to index the
>> apicid nid table with -1U.  Care to send a patch?  I'm gonna have to
>> rebase anyway and can put the patch at the front.
>>
>> Thanks.
>>
>
>  Hi Tejun again :) I've looked some more and if I'm not missing something
> (Yinghai?) the code is broken in another way. We might have AMD system
> with corrupted MADT table so boot_cpu_physical_apicid remains =-1U
> and then we better BUG_ON instead of possible access of out-of-range
> __apicid_to_node array. If MADT is parsed successfully
> boot_cpu_physical_apicid
> will have correct value. So I think we rather should add something like the
> patch below. Again better Yinghai check it first so I would not _miss_ the
> point that such situation is impossible at all. (And if I'm right we need to
> check for set_apicid_to_node(apicid, ) never exceed MAX_LOCAL_APIC as well.
>
> Yinghai am I missing something?
>
> --
>    Cyrill
>
> ---
> x86, numa: amd -- Check for screwed MADT table
>
> In case if MADT table is corrupted we might end up
> with boot_cpu_physical_apicid = -1U, corebits > 0 and
> get out of __apicid_to_node array bound access. Check for
> boot_cpu_physical_apicid being not default value.
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@xxxxxxxxxx>
> ---
>  arch/x86/mm/amdtopology_64.c |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> Index: linux-2.6.git/arch/x86/mm/amdtopology_64.c
> =====================================================================
> --- linux-2.6.git.orig/arch/x86/mm/amdtopology_64.c
> +++ linux-2.6.git/arch/x86/mm/amdtopology_64.c
> @@ -271,9 +271,14 @@ int __init amd_scan_nodes(void)
>        bits = boot_cpu_data.x86_coreid_bits;
>        cores = (1<<bits);
>        apicid_base = 0;
> +
>        /* get the APIC ID of the BSP early for systems with apicid lifting
> */
>        early_get_boot_cpu_id();
> -       if (boot_cpu_physical_apicid > 0) {
> +       if (boot_cpu_physical_apicid == -1U || ) {
> +               pr_err("BAD APIC ID: %02x, NUMA node scaning canceled\n",
> +                       boot_cpu_physical_apicid);
> +               return -1;
> +       } else if (boot_cpu_physical_apicid > 0) {
>                pr_info("BSP APIC ID: %02x\n", boot_cpu_physical_apicid);
>                apicid_base = boot_cpu_physical_apicid;
>        }

could just change

- if (boot_cpu_physical_apicid > 0) {
+ if (boot_cpu_physical_apicid != -1U) {
pr_info("BSP APIC ID: %02x\n", boot_cpu_physical_apicid);
apicid_base = boot_cpu_physical_apicid;
}

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/