Re: [PATCH] perf/x86/uncore: Correct the number of CHAs on EMR

From: Liang, Kan
Date: Tue Sep 05 2023 - 12:27:16 EST




On 2023-09-04 3:10 p.m., Ingo Molnar wrote:
>
> * Liang, Kan <kan.liang@xxxxxxxxxxxxxxx> wrote:
>
>>
>>
>> On 2023-09-03 4:40 a.m., Ingo Molnar wrote:
>>>
>>> * kan.liang@xxxxxxxxxxxxxxx <kan.liang@xxxxxxxxxxxxxxx> wrote:
>>>
>>>> From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>>>
>>>> The MSR UNC_CBO_CONFIG, which was used to detect the number of CHAs on
>>>> SPR, is broken on EMR XCC. It always returns 0.
>>>>
>>>> Roll back to the discovery method, which can give the correct number for
>>>> this case.
>>>>
>>>> Fixes: 38776cc45eb7 ("perf/x86/uncore: Correct the number of CHAs on SPR")
>>>> Reported-by: Stephane Eranian <eranian@xxxxxxxxxx>
>>>> Reported-by: Yunying Sun <yunying.sun@xxxxxxxxx>
>>>> Tested-by: Yunying Sun <yunying.sun@xxxxxxxxx>
>>>> Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
>>>> ---
>>>> arch/x86/events/intel/uncore_snbep.c | 4 +++-
>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
>>>> index d49e90dc04a4..c41d7d46481c 100644
>>>> --- a/arch/x86/events/intel/uncore_snbep.c
>>>> +++ b/arch/x86/events/intel/uncore_snbep.c
>>>> @@ -6475,7 +6475,9 @@ void spr_uncore_cpu_init(void)
>>>>  	type = uncore_find_type_by_id(uncore_msr_uncores, UNCORE_SPR_CHA);
>>>>  	if (type) {
>>>>  		rdmsrl(SPR_MSR_UNC_CBO_CONFIG, num_cbo);
>>>> -		type->num_boxes = num_cbo;
>>>> +		/* The MSR doesn't work on the EMR XCC. Roll back to the discovery method. */
>>>> +		if (num_cbo)
>>>> +			type->num_boxes = num_cbo;
>>>
>>> So in the zero case we don't write type->num_boxes and leave it as-is.
>>>
>>> How does this fall back to the discovery method, is the existing (default?)
>>> value of type->num_boxes some special value?
>>>
>>
>> Starting with SPR, the basic uncore PMON information is retrieved from
>> the discovery table (which resides in an MMIO space populated by BIOS).
>> This is called the discovery method. The existing value of
>> type->num_boxes comes from the discovery table.
>>
>> On some SPR variants, a firmware bug makes the value from the
>> discovery table incorrect. For those, we use the value from
>> SPR_MSR_UNC_CBO_CONFIG to replace the one from the discovery table; see
>> 38776cc45eb7 ("perf/x86/uncore: Correct the number of CHAs on SPR").
>>
>> Unfortunately, SPR_MSR_UNC_CBO_CONFIG isn't available on the EMR
>> XCC, where it always reads 0 (it works well on other EMR variants).
>> But the above firmware bug doesn't impact the EMR XCC. So when the MSR
>> reads 0, this patch does NOT let the value from SPR_MSR_UNC_CBO_CONFIG
>> replace the existing value from the discovery table.
>
> Thanks - the comment & changelog should probably reflect this background.
>

I will update the comment & changelog and send a V2.
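
For reference, the fallback behaviour discussed above can be sketched in
plain user-space C. This is only an illustration under stated assumptions,
not the kernel code: struct uncore_type_sketch and apply_cbo_config() are
hypothetical names, and the MSR read is modelled as a plain function
argument instead of rdmsrl().

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's uncore type descriptor. */
struct uncore_type_sketch {
	unsigned int num_boxes;	/* assumed pre-populated from the discovery table */
};

/*
 * Apply the CHA count read from the MSR, but only when it is non-zero.
 * A zero read (as reported on EMR XCC) leaves the discovery-table value
 * untouched, which is what "rolling back to the discovery method" means.
 */
static void apply_cbo_config(struct uncore_type_sketch *type, uint64_t num_cbo)
{
	if (num_cbo)
		type->num_boxes = (unsigned int)num_cbo;
}
```

With num_boxes already set from the discovery table, calling
apply_cbo_config(&type, 0) keeps that value, while any non-zero MSR value
overrides it.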

Thanks,
Kan