Re: [PATCHv2] x86/mce: Avoid reading every machine check bank register twice.

From: Borislav Petkov
Date: Thu Apr 19 2012 - 12:03:47 EST


On Wed, Apr 18, 2012 at 03:19:40PM -0700, Luck, Tony wrote:
> Reading machine check bank registers is slow. There is a trend of
> increasing the number of banks, and the number of cores. The main section
> of do_machine_check() is a serialized section where each cpu in turn
> checks every bank. Even on a little two socket SandyBridge-EP system
> that multiplies out as:
>
> 2 sockets * 8 cores * 2 hyperthreads * 20 banks = 640 MSRs
>
> We already scan the banks in parallel in mce_no_way_out() to see if there
> is a fatal error anywhere in the system. If we build a cache of VALID
> bits during this scan, we can avoid uselessly re-reading banks that have
> no data. Note that this cache is only a hint. If the valid bit is set in a
> shared bank, all cpus that share that bank will see it during the parallel
> scan, but the first to find it in the sequential scan will (usually) clear
> the bank.
>
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> ---
>
> Version 2:
> + mce_no_way_out() now scans all banks to build the full bitmap (instead of
> breaking out early if it saw a fatal issue).
> + changed name of bitmap from "hint" to "valid_banks"
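
For readers following along, the two-pass idea in the description above can be
sketched in user-space C. This is only an illustration under stated
assumptions, not the patch itself: struct fake_cpu, read_bank(),
scan_valid_banks() and handle_valid_banks() are hypothetical names, and the
MSRs are simulated by a plain array with a read counter. The one hardware
detail relied on is that VAL is bit 63 of an MCi_STATUS register. The bank
count of 20 is taken from the SandyBridge-EP example.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_BANKS      20            /* bank count from the example above */
#define MCI_STATUS_VAL (1ULL << 63)  /* VAL bit of an MCi_STATUS register */

/* Hypothetical stand-in for one cpu's view of the bank status MSRs. */
struct fake_cpu {
	uint64_t status[NUM_BANKS];
	unsigned long reads;         /* number of simulated MSR reads */
};

/* Simulated (slow) MSR read; only counts the access. */
static uint64_t read_bank(struct fake_cpu *cpu, size_t bank)
{
	cpu->reads++;
	return cpu->status[bank];
}

/* Parallel pass: visit every bank once and remember which have VAL set. */
static uint32_t scan_valid_banks(struct fake_cpu *cpu)
{
	uint32_t valid = 0;

	for (size_t i = 0; i < NUM_BANKS; i++)
		if (read_bank(cpu, i) & MCI_STATUS_VAL)
			valid |= (uint32_t)1 << i;
	return valid;
}

/*
 * Serialized pass: skip banks the bitmap says were empty. The bitmap is
 * only a hint, so VAL is checked again: a cpu sharing the bank may have
 * cleared it already. Returns the number of banks handled here.
 */
static unsigned int handle_valid_banks(struct fake_cpu *cpu, uint32_t valid)
{
	unsigned int handled = 0;

	for (size_t i = 0; i < NUM_BANKS; i++) {
		if (!(valid & ((uint32_t)1 << i)))
			continue;            /* no data seen earlier: no read */
		if (!(read_bank(cpu, i) & MCI_STATUS_VAL))
			continue;            /* stale hint, already cleared */
		cpu->status[i] = 0;          /* "log" and clear the bank */
		handled++;
	}
	return handled;
}
```

In this sketch a cpu with two valid banks does 20 reads in the first pass and
2 in the second, rather than 20 + 20. Summed over the 2 * 8 * 2 = 32 logical
cpus of the example, the serialized section then issues a read per valid bank
instead of all 640.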

Looks good.

Acked-by: Borislav Petkov <borislav.petkov@xxxxxxx>

Thanks.

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551