Re: [PATCH v2] x86/resctrl: Fix zero cbm for AMD in cbm_validate

From: Reinette Chatre
Date: Tue May 17 2022 - 12:49:37 EST


Hi Fenghua,

On 5/17/2022 9:33 AM, Fenghua Yu wrote:
> Hi, Eranian,
>
> On Mon, May 16, 2022 at 05:12:34PM -0700, Stephane Eranian wrote:
>> AMD supports cbm with no bits set as reflected in rdt_init_res_defs_amd() by:
> ...
>> @@ -107,6 +107,10 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
>> first_bit = find_first_bit(&val, cbm_len);
>> zero_bit = find_next_zero_bit(&val, cbm_len, first_bit);
>>
>> + /* no need to check bits if arch supports no bits set */
>> + if (r->cache.arch_has_empty_bitmaps && val == 0)
>> + goto done;
>> +
>> /* Are non-contiguous bitmaps allowed? */
>> if (!r->cache.arch_has_sparse_bitmaps &&
>> (find_next_bit(&val, cbm_len, zero_bit) < cbm_len)) {
>> @@ -119,7 +123,7 @@ static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
>> r->cache.min_cbm_bits);
>> return false;
>> }
>> -
>> +done:
>> *data = val;
>> return true;
>> }
>
> Isn't it AMD supports 0 minimal CBM bits? Then should set its min_cbm_bits as 0.
> Is the following patch a better fix? I don't have AMD machine and cannot
> test the patch.
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 6055d05af4cc..031d77dd982d 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -909,6 +909,7 @@ static __init void rdt_init_res_defs_amd(void)
> r->cache.arch_has_sparse_bitmaps = true;
> r->cache.arch_has_empty_bitmaps = true;
> r->cache.arch_has_per_cpu_cfg = true;
> + r->cache.min_cbm_bits = 0;
> } else if (r->rid == RDT_RESOURCE_MBA) {
> hw_res->msr_base = MSR_IA32_MBA_BW_BASE;
> hw_res->msr_update = mba_wrmsr_amd;

That is actually what Stephane's V1 [1] did, and I proposed that
he fix it with (almost) what he has in V2 (I think the check
can be moved earlier, before any bits are searched for).

The reasons why I proposed this change are:
- min_cbm_bits is a value that is exposed to user space, and ever since
AMD support was added it has been 1 on those systems. I do not know how
user space uses this value, and unless I can be certain that making it 0
will not affect user space I would prefer not to make such a change.

- this fix restores original behavior that was changed in the patch noted
in the Fixes link.

- the min_cbm_bits = 0 fix relies on arithmetic on the "not found" return
values of the bit searches on an empty bitmap. I find that hides what the
code does, and the explicit empty-bitmap check is more obvious.
You can see this feedback in my response to V1.

- a fix like the above snippet is incomplete. To be complete, the
initialization of rdt_resources_all[] would need to stop initializing
min_cbm_bits, and the platform-specific values would need to be set in
rdt_init_res_defs_amd() and rdt_init_res_defs_intel() respectively.
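
To illustrate the ordering I have in mind, here is a minimal userspace
sketch, not the kernel code. struct cache_sketch, next_bit() and
cbm_validate_sketch() are hypothetical stand-ins for the relevant
rdt_resource fields, find_next_bit()/find_next_zero_bit() and
cbm_validate(); the range check is a simplification I am assuming for
the example. The empty bitmap is handled explicitly before any bits are
searched for, so min_cbm_bits can stay at 1:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the cache fields of struct rdt_resource. */
struct cache_sketch {
	unsigned int cbm_len;
	unsigned int min_cbm_bits;
	bool arch_has_sparse_bitmaps;
	bool arch_has_empty_bitmaps;
};

/*
 * Hypothetical helper: index of the first bit at or after 'start' whose
 * state equals 'want_set', or 'len' if there is none (the "not found"
 * convention used by the kernel bit-search helpers).
 */
static unsigned int next_bit(uint32_t val, unsigned int len,
			     unsigned int start, bool want_set)
{
	for (unsigned int i = start; i < len; i++) {
		if ((((val >> i) & 1u) != 0) == want_set)
			return i;
	}
	return len;
}

static bool cbm_validate_sketch(uint32_t val, const struct cache_sketch *c,
				uint32_t *data)
{
	unsigned int first_bit, zero_bit;

	/* Simplified range check: no bits at or above cbm_len. */
	if (((uint64_t)val >> c->cbm_len) != 0)
		return false;

	/* Explicit empty-bitmap decision, before any bit searching. */
	if (val == 0) {
		if (!c->arch_has_empty_bitmaps)
			return false;
		*data = val;
		return true;
	}

	first_bit = next_bit(val, c->cbm_len, 0, true);
	zero_bit = next_bit(val, c->cbm_len, first_bit, false);

	/* Are non-contiguous bitmaps allowed? */
	if (!c->arch_has_sparse_bitmaps &&
	    next_bit(val, c->cbm_len, zero_bit, true) < c->cbm_len)
		return false;

	/* val != 0 here, so this never depends on "not found" arithmetic. */
	if (zero_bit - first_bit < c->min_cbm_bits)
		return false;

	*data = val;
	return true;
}

/* Small self-check with assumed AMD-like and Intel-like settings. */
static void cbm_sketch_selfcheck(void)
{
	struct cache_sketch amd = { 16, 1, true, true };
	struct cache_sketch intel = { 12, 1, false, false };
	uint32_t out = 0xffffffffu;

	assert(cbm_validate_sketch(0, &amd, &out) && out == 0);
	assert(!cbm_validate_sketch(0, &intel, &out));
}
```

With val == 0 both searches would return cbm_len, so zero_bit - first_bit
is 0 and the min_cbm_bits comparison only passes if min_cbm_bits is 0.
The early return avoids leaning on that.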


Reinette

[1] https://lore.kernel.org/lkml/20220516055055.2734840-1-eranian@xxxxxxxxxx/