Re: [PATCH v3 07/21] x86/resctrl: Create mba_sc configuration in the rdt_domain

From: Reinette Chatre
Date: Mon Apr 04 2022 - 17:27:56 EST


Hi James,

On 4/4/2022 9:35 AM, James Morse wrote:
> Hi Reinette,
>
> On 4/1/22 23:54, Reinette Chatre wrote:
>> On 3/30/2022 9:43 AM, James Morse wrote:
>>> On 16/03/2022 21:50, Reinette Chatre wrote:
>>>> On 2/17/2022 10:20 AM, James Morse wrote:
>>>>> To support resctrl's MBA software controller, the architecture must provide
>>>>> a second configuration array to hold the mbps_val[] from user-space.
>>>>>
>>>>> This complicates the interface between the architecture specific code and
>>>>> the filesystem portions of resctrl that will move to /fs/, to allow
>>>>> multiple architectures to support resctrl.
>>>>>
>>>>> Make the filesystem parts of resctrl create an array for the mba_sc
>>>>> values when is_mba_sc() is set to true. The software controller
>>>>> can be changed to use this, allowing the architecture code to only
>>>>> consider the values configured in hardware.
>
> [...]
>
>>>> Considering that no domain belonging to RDT_RESOURCE_MBA will have this array this
>>>> always ends up being a null pointer de-reference.
>>>
>>> Ugh. I'm not sure how I managed to miss that. Thanks for debugging it!
>>>
>>> That loop was added to reset the array when the filesystem is mounted, as it may hold
>>> stale values from a previous mount of the filesystem. Its currently done by
>>> reset_all_ctrls(), but that function should really belong to the architecture code.
>>>
>>> Because mbm_handle_overflow() always passes a domain from the L3 to update_mba_bw(), I
>>> think the cleanest thing to do is move the reset to a helper that always operates on the
>>> L3 array. (and leave some breadcrumbs in the comments).
>
>> I think this points to more than a need to reset the correct array on mount/unmount ... or
>> perhaps I am not understanding this correctly?
>>
>> As the analysis above shows the mbps_val array only exists for rdt_domains associated
>> with RDT_RESOURCE_L3 but yet mbps_val will contain the MB value provided by user space
>> associated with RDT_RESOURCE_MBA.
>
> I've finally got my head round what is going on here: I've muddled up whether mon_capable
> is a resource or system property. mba_sc depends on the L3 being mon_capable, but the
> configuration should be associated with MBA (wherever that is).
> (basically ignore my previous reply!)
>
> The creation of the mbps_val[] should depend on supports_mba_mbps(), which uses
> is_mbm_enabled() to check whether the L3 is mon_capable. I'll check the rid too
> to make it clear its only MBA that has this.
> The call to allocate the domain in resctrl_online_domain() should be above the mon_capable
> check. (which is still needed to avoid the work guarded by is_mbm_enabled() and friends
> running for each domain).
>
>
> Thanks,
>
> James
>
> -----------------------%<-----------------------
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index e3c90f33baf2..ad0411eb2147 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -3345,6 +3345,14 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
>  
>         lockdep_assert_held(&rdtgroup_mutex);
>  
> +       if (is_mbm_enabled() && r->rid == RDT_RESOURCE_MBA) {
> +               err = mba_sc_domain_allocate(r, d);
> +               if (err) {
> +                       domain_destroy_mon_state(d);
> +                       return err;
> +               }
> +       }
> +
>         if (!r->mon_capable)
>                 return 0;
>  
> @@ -3352,12 +3360,6 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
>         if (err)
>                 return err;
>  
> -       err = mba_sc_domain_allocate(r, d);
> -       if (err) {
> -               domain_destroy_mon_state(d);
> -               return err;
> -       }
> -
>         if (is_mbm_enabled()) {
>                 INIT_DELAYED_WORK(&d->mbm_over, mbm_handle_overflow);
>                 mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL);
> -----------------------%<-----------------------

Thank you. Having mbps_val be part of RDT_RESOURCE_MBA makes things clear to me.

In the snippet you provide you may need to check the error path cleanup, I do not think
domain_destroy_mon_state(d) would be needed after moving the allocation earlier.

Reinette