Re: [PATCH v2 1/2] mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC

From: Naveen N. Rao
Date: Fri Jun 21 2013 - 03:47:15 EST


On 06/21/2013 01:04 PM, Borislav Petkov wrote:
> On Fri, Jun 21, 2013 at 02:52:25AM +0530, Naveen N. Rao wrote:
>> Exactly, but mce_poll_banks also doesn't have bits set for banks on
>> which CMCI is enabled.
>>
>> Let's say we have a cpu with 2 banks (not shared), none of which work
>> in FF mode. Both these banks support CMCI, so mce_poll_banks won't
>> have these bits set.
>>
>> On cpu offline, we call cmci_clear() which disables CMCI on these two
>> banks before offlining it. When this cpu is brought online again, we
>> call cmci_discover() which sees that mce_poll_banks doesn't have these
>> two banks enabled and will skip enabling CMCI thinking these are in
>> FF.
>
> Hmm, mce_intel has yet another bitfield - mce_banks_owned. (Btw, this is
> why I have a problem with adding yet another bitfield).
>
> The way I understand it is, if a bit is set in the owned bitfield, those
> banks belong to CMCI and are not polled.
>
> Now, can we use both mce_banks_owned and mce_poll_banks? If a bit in
> both bitfields is cleared, the corresponding bank is not polled *and* is
> not owned by CMCI => it is in FF mode.
>
> Makes sense?


Yes, but I'm afraid this won't work either: mce_banks_owned is cleared
during cpu offline. That is necessary because a CMCI rediscover is
triggered on cpu offline, so that if the bank is shared across cores, a
different cpu can claim ownership of it.

The difference between the new bitfield and the existing ones is that
the new one is not per-cpu. It is a global list, valid across cpus, of
the banks that we do not want enabled.
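
To make the failure concrete, here is a minimal user-space sketch of the
offline/online sequence for the two-bank example. It is not kernel code
and is only meant as an illustration: struct cpu_state, ff_banks,
looks_like_ff() and the toy_* functions are hypothetical stand-ins, and
only the names mce_banks_owned and mce_poll_banks are taken from
mce_intel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_BANKS 2	/* toy value matching the two-bank example */

/* Toy per-cpu state; field names mirror mce_intel, the layout does not. */
struct cpu_state {
	uint32_t mce_banks_owned;	/* banks this cpu handles via CMCI */
	uint32_t mce_poll_banks;	/* banks this cpu polls */
};

/* Hypothetical global mask of FF banks; 0 means no bank is in FF mode. */
static uint32_t ff_banks;

/* Proposed inference: bit clear in both per-cpu bitfields => FF mode. */
static bool looks_like_ff(const struct cpu_state *c, int bank)
{
	uint32_t bit = UINT32_C(1) << bank;

	assert(bank >= 0 && bank < NR_BANKS);
	return !(c->mce_banks_owned & bit) && !(c->mce_poll_banks & bit);
}

/* Toy model of the offline path: ownership is dropped for all banks. */
static void toy_cmci_clear(struct cpu_state *c)
{
	c->mce_banks_owned = 0;
}

/* Toy discovery using the per-cpu inference; returns banks claimed. */
static int toy_discover_percpu(struct cpu_state *c)
{
	int claimed = 0;

	for (int bank = 0; bank < NR_BANKS; bank++) {
		if (looks_like_ff(c, bank))
			continue;	/* skipped: assumed to be in FF mode */
		c->mce_banks_owned |= UINT32_C(1) << bank;
		claimed++;
	}
	return claimed;
}

/* Toy discovery using the global mask; returns banks claimed. */
static int toy_discover_global(struct cpu_state *c)
{
	int claimed = 0;

	for (int bank = 0; bank < NR_BANKS; bank++) {
		if (ff_banks & (UINT32_C(1) << bank))
			continue;	/* skipped: listed as FF globally */
		c->mce_banks_owned |= UINT32_C(1) << bank;
		claimed++;
	}
	return claimed;
}
```

Starting from mce_banks_owned = 0x3 and mce_poll_banks = 0, calling
toy_cmci_clear() leaves both bitfields empty, so toy_discover_percpu()
treats both banks as FF and claims none of them. toy_discover_global()
does not depend on the per-cpu state, so with ff_banks == 0 it claims
both banks again.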


Thanks,
Naveen
