Re: [PATCH V2 09/10] x86/resctrl: Pseudo-lock portions of multiple resources

From: Reinette Chatre
Date: Thu Aug 08 2019 - 16:13:49 EST


Hi Borislav,

On 8/8/2019 1:44 AM, Borislav Petkov wrote:
> On Wed, Aug 07, 2019 at 12:23:29PM -0700, Reinette Chatre wrote:
>> I do not fully understand this proposal. All those goto labels take care
>> of the different failures that can be encountered during the
>> initialization of the pseudo-lock region. Each initialization failure is
>> associated with a goto that jumps to the cleanup path. The
>> initialization starts with constraining the C-states (initializing
>> plr->pm_reqs), but if I move that I think it will not reduce the goto
>> labels, just change their order because of the other initialization done
>> (plr->size, plr->line_size, plr->cpu).
>
> Here's one possible way to do it, pasting the whole function here as it
> is easier to read this way than an incremental diff on top.
>
> You basically cache all attributes in local variables and assign them to
> the plr struct only on success, at the end. This way, there are no goto
> labels and the C-states constraining, i.e., the most expensive operation,
> happens last, only after all the other simpler checks have succeeded. And
> you don't have to call pseudo_lock_cstates_relax() prematurely, when one
> of those easier checks fails.
>
> Makes sense?

It does. This looks much better. Thank you very much.
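
(For the archives, the error paths under discussion pair each
initialization step with its cleanup, roughly like this condensed
sketch; "init_fails" stands in for the individual checks and this is
not the actual patch:

        ret = pseudo_lock_cstates_constrain(plr, &l2_d->cpu_mask);
        if (ret < 0)
                goto out;

        /* initialize plr->size, plr->line_size, plr->cpu ... */
        if (init_fails)
                goto out_cstates;

        return 0;

out_cstates:
        pseudo_lock_cstates_relax(plr);
out:
        return ret;

so moving the C-states constraining around only reorders the labels.)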

>
> Btw, I've marked the cpu_online() check with a "CPU hotplug
> lock?!?" question because I don't see you holding that lock with
> get_online_cpus()/put_online_cpus().

There is a lock ordering dependency between cpu_hotplug_lock and
rdtgroup_mutex (cpu_hotplug_lock must be taken before rdtgroup_mutex)
that has to be maintained. To do so in this flow, cpus_read_lock() is
taken in rdtgroup_schemata_write(), so quite a distance from where it is
needed.

Perhaps I should add a comment at the location where the lock is
required to document where the lock is obtained?
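
Something like this at the cpu_online() check is what I have in mind (a
sketch only; lockdep_assert_cpus_held() would additionally catch a
missing cpus_read_lock() at runtime when lockdep is enabled):

        /*
         * cpu_hotplug_lock is already held at this point: it is obtained
         * way up in the call chain, in rdtgroup_schemata_write(), before
         * rdtgroup_mutex is taken.
         */
        lockdep_assert_cpus_held();

        if (!cpu_online(cpu)) {
                ...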

> static int pseudo_lock_l2_l3_portions_valid(struct pseudo_lock_region *plr,
>                                             struct pseudo_lock_portion *l2_p,
>                                             struct pseudo_lock_portion *l3_p)
> {
>         unsigned int l2_size, l3_size, size, line_size, cpu;
>         struct rdt_domain *l2_d, *l3_d;
>
>         l2_d = rdt_find_domain(l2_p->r, l2_p->d_id, NULL);
>         if (IS_ERR_OR_NULL(l2_d)) {
>                 rdt_last_cmd_puts("Cannot locate L2 cache domain\n");
>                 return -1;
>         }
>
>         l3_d = rdt_find_domain(l3_p->r, l3_p->d_id, NULL);
>         if (IS_ERR_OR_NULL(l3_d)) {
>                 rdt_last_cmd_puts("Cannot locate L3 cache domain\n");
>                 return -1;
>         }
>
>         if (!cpumask_subset(&l2_d->cpu_mask, &l3_d->cpu_mask)) {
>                 rdt_last_cmd_puts("L2 and L3 caches need to be in same hierarchy\n");
>                 return -1;
>         }
>
>         l2_size = rdtgroup_cbm_to_size(l2_p->r, l2_d, l2_p->cbm);
>         l3_size = rdtgroup_cbm_to_size(l3_p->r, l3_d, l3_p->cbm);
>
>         if (l2_size > l3_size) {
>                 rdt_last_cmd_puts("L3 cache portion has to be same size or larger than L2 cache portion\n");
>                 return -1;
>         }
>
>         size = l2_size;
>
>         l2_size = get_cache_line_size(cpumask_first(&l2_d->cpu_mask), l2_p->r->cache_level);
>         l3_size = get_cache_line_size(cpumask_first(&l3_d->cpu_mask), l3_p->r->cache_level);
>         if (l2_size != l3_size) {
>                 rdt_last_cmd_puts("L2 and L3 caches have different coherency cache line sizes\n");
>                 return -1;
>         }
>
>         line_size = l2_size;
>
>         cpu = cpumask_first(&l2_d->cpu_mask);
>
>         /*
>          * CPU hotplug lock?!?
>          */
>         if (!cpu_online(cpu)) {
>                 rdt_last_cmd_printf("CPU %u associated with cache not online\n", cpu);
>                 return -1;
>         }
>
>         if (!get_cache_inclusive(cpu, l3_p->r->cache_level)) {
>                 rdt_last_cmd_puts("L3 cache not inclusive\n");
>                 return -1;
>         }
>
>         /*
>          * All checks passed, constrain C-states:
>          */
>         if (pseudo_lock_cstates_constrain(plr, &l2_d->cpu_mask)) {
>                 rdt_last_cmd_puts("Cannot limit C-states\n");
>                 pseudo_lock_cstates_relax(plr);
>                 return -1;
>         }
>
>         plr->line_size = line_size;
>         plr->size = size;
>         plr->cpu = cpu;
>
>         return 0;
> }
>

Thank you very much.

Reinette