Re: [PATCH v1 05/31] x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid()

From: Dave Martin
Date: Tue Apr 16 2024 - 12:17:04 EST


On Mon, Apr 15, 2024 at 10:47:55AM -0700, Reinette Chatre wrote:
> Hi Dave,
>
> On 4/12/2024 9:12 AM, Dave Martin wrote:
> > On Mon, Apr 08, 2024 at 08:16:08PM -0700, Reinette Chatre wrote:
> >> Hi James,
> >>
> >> On 3/21/2024 9:50 AM, James Morse wrote:
> >>> update_cpu_closid_rmid() takes a struct rdtgroup as an argument, which
> >>> it uses to update the local CPUs default pqr values. This is a problem
> >>> once the resctrl parts move out to /fs/, as the arch code cannot
> >>> poke around inside struct rdtgroup.
> >>>
> >>> Rename update_cpu_closid_rmid() as resctrl_arch_sync_cpus_defaults()
> >>> to be used as the target of an IPI, and pass the effective CLOSID
> >>> and RMID in a new struct.
> >>>
> >>> Signed-off-by: James Morse <james.morse@xxxxxxx>
> >>> ---
> >>> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 19 +++++++++++++++----
> >>> include/linux/resctrl.h | 11 +++++++++++
> >>> 2 files changed, 26 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> >>> index 5d2c1ce5b6b1..18f097fce51e 100644
> >>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> >>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> >>> @@ -341,13 +341,13 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
> >>> * from update_closid_rmid() is protected against __switch_to() because
> >>> * preemption is disabled.
> >>> */
> >>> -static void update_cpu_closid_rmid(void *info)
> >>> +void resctrl_arch_sync_cpu_defaults(void *info)
> >>> {
> >>> - struct rdtgroup *r = info;
> >>> + struct resctrl_cpu_sync *r = info;
> >>>
> >>> if (r) {
> >>> this_cpu_write(pqr_state.default_closid, r->closid);
> >>> - this_cpu_write(pqr_state.default_rmid, r->mon.rmid);
> >>> + this_cpu_write(pqr_state.default_rmid, r->rmid);
> >>> }
> >>>
> >>> /*
> >>> @@ -362,11 +362,22 @@ static void update_cpu_closid_rmid(void *info)
> >>> * Update the PGR_ASSOC MSR on all cpus in @cpu_mask,
> >>> *
> >>> * Per task closids/rmids must have been set up before calling this function.
> >>> + * @r may be NULL.
> >>> */
> >>> static void
> >>> update_closid_rmid(const struct cpumask *cpu_mask, struct rdtgroup *r)
> >>> {
> >>> - on_each_cpu_mask(cpu_mask, update_cpu_closid_rmid, r, 1);
> >>> + struct resctrl_cpu_sync defaults;
> >>> + struct resctrl_cpu_sync *defaults_p = NULL;
> >>
> >> Please maintain reverse fir order.
> >
> > Or, more tersely as follows?
> >
> > struct resctrl_cpu_sync defaults, *defaults_p = NULL;
>
> Sure.

[*]

> >
> > "Reverse fir order" seems to be documented as a preference rather than a
> > rule.
>
> This does not seem to be a place that warrants an exception to this
> preference. Note how this function is not consistent with any other
> in the file.

Ack (just bikeshedding here TBH).

>
> > The declarations can be swapped, but defaults_p is in some sense a weak
> > pointer to defaults, so it feels a bit strange to declare them backwards.
> >
> > Alternatively, could we rename defaults_p to p? Given the size of this
> > function I don't think that impacts clarity.

[...]

> >
> > I'll wait for your opinion on this.
> >
> >
>
> Do you imply that this would maintain the order in this patch? It does
> not look to me that it would but I may be looking wrong.

I'm not sure without looking again, but since this discussion is not a
good use of your time I'll just go ahead and implement the change at
[*] above, while restoring reverse fir order, if that is good for you.

>
> sidenote: the "on_each_cpu_mask()" in update_closid_rmid() can be on
> one line.

I guess that might have been split to stick to the 80-char limit.

Due to the small size of this function, shall I just rename defaults_p to p?
Alternatively, there are already a few non-printk lines over 80 chars, so
maybe we can tolerate one more here?

>
> ..
>
> >>> + * struct resctrl_cpu_sync, or NULL.
> >>> + */
> >>
> >> Updating the CPU's defaults is not the primary goal of this function and because
> >> of that I do not think this should be the focus with the main goal (updating
> >> RMID and CLOSID on CPU) ignored. Specifically, this function only updates
> >> the defaults if *info is set but it _always_ ensures CPU is running with
> >> appropriate CLOSID/RMID (which may or may not be from a CPU default).
> >>
> >> I think resctrl_arch_sync_cpu_closid_rmid() may be more appropriate
> >> and the comment needs to elaborate what the function does.
> >>
> >>> +void resctrl_arch_sync_cpu_defaults(void *info);
> >
> > That seems reasonable, and follows the original naming and what the
> > code does:
> >
> > What about:
> >
> > /**
> > * resctrl_arch_sync_cpu_defaults() - Refresh the CPU's CLOSID and RMID.
> > * Call via IPI.
>
> Did you intend to change function name?

Er, yes, I meant to use your suggestion here, so:
resctrl_arch_sync_cpu_closid_rmid().

Also, Babu Moger's suggestion to rename struct resctrl_cpu_sync
to resctrl_cpu_defaults seems good, since that accurately describes what
is specified in the struct (and what is *not* specified if NULL is
passed).

>
> How about "Refresh the CPU's ..." -> "Refresh this CPU's ..." I think it
> makes it more obvious how this function is called.

Agreed.

>
> > * @info: If non-NULL, a pointer to a struct resctrl_cpu_sync specifying
> > * the new CLOSID and RMID for tasks in the default resctrl ctrl
> > * and mon group when running on this CPU. If NULL, the default
> > * CLOSID and RMID are not changed.
>
> "If NULL, this CPU is not re-assigned to a different group." ?

Agreed.

> > *
> > * This is how reassignment of CPUs and/or tasks to different resctrl groups
> > * is propagated when requested by the resctrl fs core code.
>
> Could you please use imperative tone here? For example, "Propagates reassignment
> of CPUs and/or tasks to different resctrl groups."

Yes, that's better (and shorter).

>
> > *
> > * This function should typically record the per-cpu defaults specified by
>
> "should" sounds like there may be cases when this is not done? Maybe just
> "Records the per-CPU defaults specified ..."

I didn't want to pre-judge what implementation-specific cruft the arch
code needs here, so I was intentionally vague. But the arch would need
to put the CPU defaults into effect somehow or other, so yes, I think
your text is better here.

I'll make a note of those changes.
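
For reference, a rough userspace sketch of how the comment and function
might read with the above folded in. This is illustrative only: struct
pqr_state_model and this_cpu_state are hypothetical stand-ins for the
kernel's per-CPU pqr_state, the struct name assumes Babu's suggested
rename, and the final "refresh" step simply assumes the running task is
in the default group rather than modelling what the arch code really
does there.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's per-CPU pqr_state (model only). */
struct pqr_state_model {
	uint32_t cur_closid;
	uint32_t cur_rmid;
	uint32_t default_closid;
	uint32_t default_rmid;
};

/* One instance models "this CPU"; the kernel uses real per-CPU data. */
static struct pqr_state_model this_cpu_state;

/* Assumes the rename of struct resctrl_cpu_sync suggested in this thread. */
struct resctrl_cpu_defaults {
	uint32_t closid;
	uint32_t rmid;
};

/**
 * resctrl_arch_sync_cpu_closid_rmid() - Refresh this CPU's CLOSID and RMID.
 *					  Call via IPI.
 * @info:	If non-NULL, a pointer to a struct resctrl_cpu_defaults
 *		specifying the new CLOSID and RMID for tasks in the default
 *		resctrl ctrl and mon group when running on this CPU.  If NULL,
 *		this CPU is not re-assigned to a different group.
 *
 * Propagates reassignment of CPUs and/or tasks to different resctrl groups.
 *
 * Records the per-CPU defaults specified by @info, if any.
 */
void resctrl_arch_sync_cpu_closid_rmid(void *info)
{
	struct resctrl_cpu_defaults *r = info;

	/* Only touch the defaults when the caller supplied new ones. */
	if (r) {
		this_cpu_state.default_closid = r->closid;
		this_cpu_state.default_rmid = r->rmid;
	}

	/*
	 * Always refresh what the CPU is running with.  Model assumption:
	 * the current task is in the default group, so the defaults apply.
	 */
	this_cpu_state.cur_closid = this_cpu_state.default_closid;
	this_cpu_state.cur_rmid = this_cpu_state.default_rmid;
}
```

The point of the sketch is just the split in behaviour: the defaults
change only when @info is non-NULL, while the refresh happens on every
call.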

[...]

Cheers
---Dave