Re: [PATCH v3 07/21] x86/resctrl: Create mba_sc configuration in the rdt_domain

From: James Morse
Date: Mon Apr 04 2022 - 18:09:10 EST


Hi Reinette,

On 4/1/22 23:54, Reinette Chatre wrote:
On 3/30/2022 9:43 AM, James Morse wrote:
On 16/03/2022 21:50, Reinette Chatre wrote:
On 2/17/2022 10:20 AM, James Morse wrote:
To support resctrl's MBA software controller, the architecture must provide
a second configuration array to hold the mbps_val[] from user-space.

This complicates the interface between the architecture specific code and
the filesystem portions of resctrl that will move to /fs/, to allow
multiple architectures to support resctrl.

Make the filesystem parts of resctrl create an array for the mba_sc
values when is_mba_sc() is set to true. The software controller
can be changed to use this, allowing the architecture code to only
consider the values configured in hardware.

[...]

Considering that no domain belonging to RDT_RESOURCE_MBA will have this array this
always ends up being a null pointer de-reference.

Ugh. I'm not sure how I managed to miss that. Thanks for debugging it!

That loop was added to reset the array when the filesystem is mounted, as it may hold
stale values from a previous mount of the filesystem. Its currently done by
reset_all_ctrls(), but that function should really belong to the architecture code.

Because mbm_handle_overflow() always passes a domain from the L3 to update_mba_bw(), I
think the cleanest thing to do is move the reset to a helper that always operates on the
L3 array. (and leave some breadcrumbs in the comments).

I think this points to more than a need to reset the correct array on mount/unmount ... or
perhaps I am not understanding this correctly?

As the analysis above shows the mbps_val array only exists for rdt_domains associated
with RDT_RESOURCE_L3 but yet mbps_val will contain the MB value provided by user space
associated with RDT_RESOURCE_MBA.

I've finally got my head round what is going on here: I've muddled up whether mon_capable
is a resource or system property. mba_sc depends on the L3 being mon_capable, but the
configuration should be associated with MBA (wherever that is).
(basically ignore my previous reply!)

The creation of the mbps_val[] should depend on supports_mba_mbps(), which uses
is_mbm_enabled() to check whether the L3 is mon_capable. I'll check the rid too
to make it clear its only MBA that has this.
The call to allocate the domain in resctrl_online_domain() should be above the mon_capable
check. (which is still needed to avoid the work guarded by is_mbm_enabled() and friends
running for each domain).


Thanks,

James

-----------------------%<-----------------------
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e3c90f33baf2..ad0411eb2147 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3345,6 +3345,14 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
lockdep_assert_held(&rdtgroup_mutex);
+ if (is_mbm_enabled() && r->rid == RDT_RESOURCE_MBA) {
+ err = mba_sc_domain_allocate(r, d);
+ if (err) {
+ domain_destroy_mon_state(d);
+ return err;
+ }
+ }
+
if (!r->mon_capable)
return 0;
@@ -3352,12 +3360,6 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
if (err)
return err;
- err = mba_sc_domain_allocate(r, d);
- if (err) {
- domain_destroy_mon_state(d);
- return err;
- }
-
if (is_mbm_enabled()) {
INIT_DELAYED_WORK(&d->mbm_over, mbm_handle_overflow);
mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL);
-----------------------%<-----------------------