Re: [PATCH 2/5] x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments

From: Jaswinder Singh Rajput
Date: Mon May 17 2010 - 15:05:45 EST


Hello Frank and Borislav,

On Sun, May 16, 2010 at 12:59 AM, Jaswinder Singh Rajput
<jaswinderlinux@xxxxxxxxx> wrote:
> Hello Frank and Borislav,
>
> On Thu, Apr 22, 2010 at 7:36 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
>> From: Frank Arnold <frank.arnold@xxxxxxx>
>>
>> When running a quest kernel on xen we get:
>>
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
>> IP: [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df
>> PGD 0
>> Oops: 0000 [#1] SMP
>> last sysfs file:
>> CPU 0
>> Modules linked in:
>>
>> Pid: 0, comm: swapper Tainted: G        W  2.6.34-rc3 #1 /HVM domU
>> RIP: 0010:[<ffffffff8142f2fb>]  [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x
>> 2ca/0x3df
>> RSP: 0018:ffff880002203e08  EFLAGS: 00010046
>> RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000060
>> RDX: 0000000000000000 RSI: 0000000000000040 RDI: 0000000000000000
>> RBP: ffff880002203ed8 R08: 00000000000017c0 R09: ffff880002203e38
>> R10: ffff8800023d5d40 R11: ffffffff81a01e28 R12: ffff880187e6f5c0
>> R13: ffff880002203e34 R14: ffff880002203e58 R15: ffff880002203e68
>> FS:  0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> CR2: 0000000000000038 CR3: 0000000001a3c000 CR4: 00000000000006f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a44020)
>> Stack:
>>  ffffffff810d7ecb ffff880002203e20 ffffffff81059140 ffff880002203e30
>> <0> ffffffff810d7ec9 0000000002203e40 000000000050d140 ffff880002203e70
>> <0> 0000000002008140 0000000000000086 ffff880040020140 ffffffff81068b8b
>> Call Trace:
>>  <IRQ>
>>  [<ffffffff810d7ecb>] ? sync_supers_timer_fn+0x0/0x1c
>>  [<ffffffff81059140>] ? mod_timer+0x23/0x25
>>  [<ffffffff810d7ec9>] ? arm_supers_timer+0x34/0x36
>>  [<ffffffff81068b8b>] ? hrtimer_get_next_event+0xa7/0xc3
>>  [<ffffffff81058e85>] ? get_next_timer_interrupt+0x19a/0x20d
>>  [<ffffffff8142fa23>] get_cpu_leaves+0x5c/0x232
>>  [<ffffffff8106a7b1>] ? sched_clock_local+0x1c/0x82
>>  [<ffffffff8106a9a0>] ? sched_clock_tick+0x75/0x7a
>>  [<ffffffff8107748c>] generic_smp_call_function_single_interrupt+0xae/0xd0
>>  [<ffffffff8101f6ef>] smp_call_function_single_interrupt+0x18/0x27
>>  [<ffffffff8100a773>] call_function_single_interrupt+0x13/0x20
>>  <EOI>
>>  [<ffffffff8143c468>] ? notifier_call_chain+0x14/0x63
>>  [<ffffffff810295c6>] ? native_safe_halt+0xc/0xd
>>  [<ffffffff810114eb>] ? default_idle+0x36/0x53
>>  [<ffffffff81008c22>] cpu_idle+0xaa/0xe4
>>  [<ffffffff81423a9a>] rest_init+0x7e/0x80
>>  [<ffffffff81b10dd2>] start_kernel+0x40e/0x419
>>  [<ffffffff81b102c8>] x86_64_start_reservations+0xb3/0xb7
>>  [<ffffffff81b103c4>] x86_64_start_kernel+0xf8/0x107
>> Code: 14 d5 40 ff ae 81 8b 14 02 31 c0 3b 15 47 1c 8b 00 7d 0e 48 8b 05 36 1c 8b
>>  00 48 63 d2 48 8b 04 d0 c7 85 5c ff ff ff 00 00 00 00 <8b> 70 38 48 8d 8d 5c ff
>>  ff ff 48 8b 78 10 ba c4 01 00 00 e8 eb
>> RIP  [<ffffffff8142f2fb>] cpuid4_cache_lookup_regs+0x2ca/0x3df
>>  RSP <ffff880002203e08>
>> CR2: 0000000000000038
>> ---[ end trace a7919e7f17c0a726 ]---
>>
>> The L3 cache index disable feature of AMD CPUs has to be disabled if the
>> kernel is running as guest on top of a hypervisor because northbridge
>> devices are not available to the guest. Currently, this fixes a boot
>> crash on top of Xen. In the future this will become an issue on KVM as
>> well.
>>
>> Check if northbridge devices are present and do not enable the feature
>> if there are none.
>>
>> Signed-off-by: Frank Arnold <frank.arnold@xxxxxxx>
>> Acked-by: Borislav Petkov <borislav.petkov@xxxxxxx>
>> ---
>>  arch/x86/kernel/cpu/intel_cacheinfo.c |    4 ++++
>>  1 files changed, 4 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c b/arch/x86/kernel/cpu/intel_cacheinfo.c
>> index acfb083..5ab14c8 100644
>> --- a/arch/x86/kernel/cpu/intel_cacheinfo.c
>> +++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
>> @@ -344,6 +344,10 @@ amd_check_l3_disable(int index, struct _cpuid4_info_regs *this_leaf)
>>             boot_cpu_data.x86_mask < 0x1)
>>                        return;
>>
>> +       /* not in virtualized environments */
>> +       if (num_k8_northbridges == 0)
>> +               return;
>> +
>>        this_leaf->can_disable = true;
>>        this_leaf->l3_indices  = amd_calc_l3_indices();
>>  }
>
> I think :
>
> 1. checking num_k8_northbridges is not sufficient as
> node_to_k8_nb_misc() of amd_calc_l3_indices() is almost doing the same
> thing.
>
> 2. num_k8_northbridges is CONFIG_K8_NB specific
>
> Can you try this :
>
> If you return from amd_calc_l3_indices() if dev == NULL.
>

In the amd_calc_l3_indices(), we are using dev without testing for
NULL, it seems like a BUG to me.

In amd_check_l3_disable(), we are doing following checks :

1. boot_cpu_data.x86 == 0x11
2. errata #382 and #388
3. num_k8_northbridges == 0

I am wondering do we really need all three checking or it is the
workaround of the amd_calc_l3_indices() BUG.

Unfortunately, I do not have access to hardware to test the changes.
Do you mind testing this where I remove all the 3 checking conditions,
can you also test individual case if this does not work.

Thanks.


diff --git a/arch/x86/kernel/cpu/intel_cacheinfo.c
b/arch/x86/kernel/cpu/intel_cacheinfo.c
index 95962a9..514692c 100644
--- a/arch/x86/kernel/cpu/intel_cacheinfo.c
+++ b/arch/x86/kernel/cpu/intel_cacheinfo.c
@@ -302,17 +302,20 @@ struct _cache_attr {
};

#ifdef CONFIG_CPU_SUP_AMD
-static unsigned int __cpuinit amd_calc_l3_indices(void)
+static int __cpuinit amd_calc_l3_indices(void)
{
/*
* We're called over smp_call_function_single() and therefore
* are on the correct cpu.
*/
+ unsigned int sc0, sc1, sc2, sc3;
+ u32 val = 0;
int cpu = smp_processor_id();
int node = cpu_to_node(cpu);
struct pci_dev *dev = node_to_k8_nb_misc(node);
- unsigned int sc0, sc1, sc2, sc3;
- u32 val = 0;
+
+ if (!dev)
+ return -1;

pci_read_config_dword(dev, 0x1C4, &val);

@@ -328,24 +331,17 @@ static unsigned int __cpuinit amd_calc_l3_indices(void)
static void __cpuinit
amd_check_l3_disable(int index, struct _cpuid4_info_regs *this_leaf)
{
- if (index < 3)
- return;
+ int indices;

- if (boot_cpu_data.x86 == 0x11)
- return;
-
- /* see errata #382 and #388 */
- if ((boot_cpu_data.x86 == 0x10) &&
- ((boot_cpu_data.x86_model < 0x8) ||
- (boot_cpu_data.x86_mask < 0x1)))
+ if (index < 3)
return;

- /* not in virtualized environments */
- if (num_k8_northbridges == 0)
+ indices = amd_calc_l3_indices();
+ if (indices < 0)
return;

this_leaf->can_disable = true;
- this_leaf->l3_indices = amd_calc_l3_indices();
+ this_leaf->l3_indices = indices;
}

static ssize_t show_cache_disable(struct _cpuid4_info *this_leaf, char *buf,

Attachment: indices.patch
Description: Binary data