Re: [PATCH 2/2] x86/mce/amd: Ensure quirks are applied in resume path as well

From: Borislav Petkov
Date: Wed Jan 09 2019 - 17:08:55 EST


On Mon, Jan 07, 2019 at 10:37:24AM +0000, S, Shirish wrote:
> This patch adds threshold quirk applicable for family 15

Same issue with "This patch" here.

> in resume path as well, since mce_amd_feature_init()
> does not have quirks applied when originating from mce_syscore_resume(),
> resulting in the below message at every successful resume:
>
> "[Firmware Bug]: cpu 0, invalid threshold interrupt offset ..."
>
> Signed-off-by: Shirish S <shirish.s@xxxxxxx>
> ---
> arch/x86/kernel/cpu/mce/amd.c | 34 ++++++++++++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
> index 89298c8..27cbf66 100644
> --- a/arch/x86/kernel/cpu/mce/amd.c
> +++ b/arch/x86/kernel/cpu/mce/amd.c
> @@ -545,6 +545,34 @@ prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,
> return offset;
> }
>
> +void disable_err_thresholding(struct cpuinfo_x86 *c)
> +{
> + int i;
> + u64 hwcr;
> + bool need_toggle;
> + u32 msrs[] = {
> + 0x00000413, /* MC4_MISC0 */
> + 0xc0000408, /* MC4_MISC1 */
> + };
> +
> + if (c->x86_model >= 0x10 && c->x86_model <= 0x7f) {

You can save yourself an indentation level by reversing the logic here:

if (c->x86 != 0x15)
return;

Also, I'm wondering if you simply can't do

if (c->x86_model < 0x10)
return;

The assumption being that all the models - even after 0x7f - are highly
unlikely to get MC4_MISC thresholding supported, all of a sudden. Might
wanna run it by HW guys first though.

> + rdmsrl(MSR_K7_HWCR, hwcr);
> +
> + /* McStatusWrEn has to be set */
> + need_toggle = !(hwcr & BIT(18));
> +
> + if (need_toggle)
> + wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
> +
> + /* Clear CntP bit safely */
> + for (i = 0; i < ARRAY_SIZE(msrs); i++)
> + msr_clear_bit(msrs[i], 62);
> +
> + /* restore old settings */
> + if (need_toggle)
> + wrmsrl(MSR_K7_HWCR, hwcr);
> + }

So you copied the same code from __mcheck_cpu_apply_quirks().

No.

In a first patch, you carve that CntP clearing code in a separate
function disable_err_thresholding() like you've done before.

Then, in a second patch, you call it from mce/amd.c and you move all the
family/model checks inside the function so that you have a sole

disable_err_thresholding();

calls where you need them.

You don't have to pass in struct cpuinfo_x86 *c - you can use
boot_cpu_data in the function.

> +}
> /* cpu init entry point, called from mce.c with preempt off */
> void mce_amd_feature_init(struct cpuinfo_x86 *c)
> {
> @@ -552,6 +580,12 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
> unsigned int bank, block, cpu = smp_processor_id();
> int offset = -1;
>
> + /* Disable error thresholding bank in S3 resume path as well,

What S3 resume path? That's the CPU init path.

Also, kernel comments style is:

/*
* A sentence ending with a full-stop.
* Another sentence. ...
* More sentences. ...
*/


Good luck!

:-)

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.