Re: [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI pollversion

From: Chen Gong
Date: Mon Jun 11 2012 - 01:46:24 EST


ä 2012/6/8 15:49, Thomas Gleixner åé:
On Thu, 7 Jun 2012, Chen Gong wrote:

But during the CPU online/offline test I found an issue. After *STORM*
mode is entered, it can't come back from *STORM* mode to normal
interrupt mode. At least there exists such an issue: when *STORM* is
entered, in the meanwhile, one CPU is offline during this period,
which means *cmci_storm_on_cpus* can't decrease to 0 because there
is one bit stuck on this offlined CPU. So we should detect such
situation and decrease on *cmci_storm_on_cpus* at proper time.

Yes, we need to reset the storm state as well I think.

BTW, even I online the *CPU* in above situation, the normal CMCI
still doesn't come back, strange.

That's weird.

Here is the reason. When CPU offline and then online, the CMCI is
reenabled, which means it opens a backdoor to enable *CMCI storm detect*
again. The result is the counter cmci_sotrm_cnt is increased again
before it decreases to zero. At last, this counter will never be back
to zero, irrelevant to CPU online. Obviously, we must stop CMCI enabling
again before CMCI storm ends, even during CPU online/offline. Besides
this, we still need to remove the count if one CPU is offlined. Like
below tmp patch I wrote (not guarantee it is OK :-))

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 2cd73ce..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
#ifdef CONFIG_X86_MCE_INTEL
unsigned long mce_intel_adjust_timer(unsigned long interval);
void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
#else
# define mce_intel_adjust_timer mce_adjust_timer_default
static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
#endif

void mce_timer_kick(unsigned long interval);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e3f8b94..5e22d99 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
if (threshold_cpu_callback)
threshold_cpu_callback(action, cpu);
mce_device_remove(cpu);
+ mce_intel_hcpu_update(cpu);
break;
case CPU_DOWN_PREPARE:
smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 92d8b5c..0051b2d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);

enum {
CMCI_STORM_NONE,
@@ -47,6 +48,11 @@ enum {
CMCI_STORM_SUBSIDED,
};

+enum {
+ CMCI_STORM_HCPU_NONE,
+ CMCI_STORM_HCPU_ACTIVE,
+};
+
static atomic_t cmci_storm_on_cpus;

static int cmci_supported(int *banks)
@@ -77,6 +83,17 @@ void mce_intel_cmci_poll(void)
machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
}

+void mce_intel_hcpu_update(unsigned long cpu)
+{
+ unsigned long *status = &per_cpu(cmci_storm_hcpu_status, cpu);
+
+ if (*status == CMCI_STORM_HCPU_ACTIVE) {
+ *status = CMCI_STORM_HCPU_NONE;
+ atomic_dec(&cmci_storm_on_cpus);
+ per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+ }
+}
+
unsigned long mce_intel_adjust_timer(unsigned long interval)
{
if (interval < CMCI_POLL_INTERVAL)
@@ -132,6 +149,7 @@ static bool cmci_storm_detect(void)

cmci_clear();
__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+ __this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_ACTIVE);
atomic_inc(&cmci_storm_on_cpus);
mce_timer_kick(CMCI_POLL_INTERVAL);
return true;



I still have another question: When we handle following case:
mce_cpu_callback(struct notifier_block *
mce_device_remove(cpu);
break;
case CPU_DOWN_PREPARE:
- del_timer_sync(t);
smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
+ del_timer_sync(t);
break;

Where we add this timer back? I can't find it in "case CPU_ONLINE".

The timer gets added back via mcheck_cpu_init(), which is called on
the newly onlined cpu from smp_callin().

Stupid me! I must too busy that day so that I lost my head. :-(.
Thanks Tomas for your explanation!


Thanks,

tglx




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/