Re: [PATCH 8/8] x86/intel_rdt/mba: Add schemata file support for MBA

From: Thomas Gleixner
Date: Wed Mar 01 2017 - 11:59:41 EST


On Fri, 17 Feb 2017, Vikas Shivappa wrote:
> @@ -13,6 +13,7 @@
> #define IA32_L2_CBM_BASE 0xd10
> #define IA32_MBA_THRTL_BASE 0xd50
> #define MAX_MBA_THRTL 100u
> +#define INVALID_DELAY 100u

100% is an interesting definition of invalid.

> #define MBA_IS_LINEAR 0x4
>
> #define L3_QOS_CDP_ENABLE 0x01ULL
> @@ -157,7 +158,9 @@ struct msr_param {
> void rdt_get_cache_infofile(struct rdt_resource *r);
> void rdt_get_mba_infofile(struct rdt_resource *r);
> int parse_cbm(char *buf, struct rdt_resource *r);
> +int parse_thrtl(char *buf, struct rdt_resource *r);

Bah. The resource is MBA not 'thrtl'. Consistent naming is NOT optional.

> void cqm_wrmsr(void *a1, void *a2, struct rdt_resource *r);
> +void mba_wrmsr(void *a1, void *a2, struct rdt_resource *r);

Again this is global, because it's only used in intel_rdt.c, right?

> /*
> + * Map the memory b/w percentage value to delay values
> + * that can be written to QOS_MSRs.
> + * There are currently no SKUs which support non linear delay values.
> + */
> +static u32 delay_bw_map(unsigned long bw, struct rdt_resource *r)
> +{
> + if (r->delay_linear)
> + return MAX_MBA_THRTL - bw;

Bah. Why do you have r->default_ctrl? Just to hard code stuff in places?

> +
> + return INVALID_DELAY;

This is crap. If the code reaches this point, then the map has been
initialized. If the map has not been initialized, but delay_linear is 0
then that's a BUG and we should handle it gracefully.

It's fine to prepare for the eventual arrival of this non linear nonsense,
but then we do it properly:

	if (r->delay_linear)
		return r->default_ctrl - bw;

	if (!r->bw_map) {
		WARN_ONCE(1, "Sensible error message");
		return r->default_ctrl;
	}
	WARN_ONCE(1, "Sensible error message");
	return r->default_ctrl;

Whether this needs to be a WARN_ONCE() or a pr_err_once() is debatable as
the call chain is known and the only value of the backtrace is that it is
more prominent in dmesg.
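
For illustration, a minimal userspace sketch of the mapping discussed above.
The field names delay_linear, default_ctrl and bw_map come from this thread.
Everything else is an assumption and not the actual patch: the struct layout,
the warn_once() helper standing in for WARN_ONCE()/pr_err_once(), and the
table lookup indexed by bandwidth percentage in the non-linear branch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, stripped-down stand-in for struct rdt_resource. */
struct rdt_resource_sketch {
	int delay_linear;		/* non-zero: delay = default_ctrl - bw */
	unsigned int default_ctrl;	/* assumed to be 100 for MBA */
	const unsigned int *bw_map;	/* assumed: indexed by bw, NULL if unset */
};

/* Userspace stand-in for pr_err_once(): print the message only once. */
static void warn_once(const char *msg)
{
	static int warned;

	if (!warned) {
		warned = 1;
		fprintf(stderr, "%s\n", msg);
	}
}

/* Assumes bw was validated beforehand, i.e. bw <= r->default_ctrl. */
static unsigned int delay_bw_map(unsigned long bw,
				 const struct rdt_resource_sketch *r)
{
	if (r->delay_linear)
		return r->default_ctrl - bw;

	if (!r->bw_map) {
		/* Missing map on non-linear hardware: complain, stay sane. */
		warn_once("MBA: non-linear delay requested without a map");
		return r->default_ctrl;
	}

	/* Assumption: the non-linear map is a plain lookup table. */
	return r->bw_map[bw];
}
```

With delay_linear set and default_ctrl at 100, a bandwidth of 30 maps to a
delay of 70. With delay_linear clear and no map, the sketch warns once and
falls back to default_ctrl instead of returning a magic constant.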

> +void mba_wrmsr(void *a1, void *a2, struct rdt_resource *r)
> +{
> + struct rdt_domain *d = (struct rdt_domain *)a2;
> + struct msr_param *m = (struct msr_param *)a1;
> + int i;
> +
> + for (i = m->low; i < m->high; i++) {
> + int idx = cbm_idx(r, i);
> +
> + /*
> + * Write the delay value for mba.
> + * delay_bw_map will return a correct value

If you use function names in a comment then please do it properly:
delay_bw_map()

> + * for this call as a nonexistant map throws error
> + * on init.

Oh well. Adding nonsensical comments is way simpler than thinking about
proper implementations, right?

The function as you wrote it does not return a correct value under all
circumstances and the nonexistent map does not throw an error at all. We
simply should not reach that code when the map does not exist.

> /*
> + * Check whether MBA bandwidth percentage value is correct.
> + * The value is checked against the minimum bandwidth
> + * values and the b/w granularity specified by the h/w.
> + */
> +static int thrtl_validate(char *buf, unsigned long *data, struct rdt_resource *r)

Darn. The input value is not a throttle value. Read the comment you wrote
yourself. It's a bandwidth percentage. So why the heck can't you make the
function name reflect what it does?

> +{
> + unsigned long bw;
> + int ret;
> +
> + ret = kstrtoul(buf, 10, &bw);
> + if (ret)
> + return ret;
> +
> + if (bw < r->min_bw || bw > MAX_MBA_THRTL || (bw % r->bw_gran))

I told you last time that requiring exact matches is bad because you can't
use the same settings across different machines. It does not matter whether
it's accurate simply because the actual outcome in the hardware is not
accurate either. So the only interesting thing is that:

	r->min_bw <= val <= MAX_MBA_BW

Yes, it's not MAX_MBA_THRTL, it's MAX_MBA_BW, which is always 100%. Please
fix that all over the place.

In principle, we could even relax the requirement to:

	0 <= val <= MAX_MBA_BW

and set the value to r->min_bw if it's smaller.

And of course this granularity check is completely bogus when non-linear
mappings come into play. So while you have plastered the rest of the code
with bogus handling of the non-linear stuff, you ignore it here.

A simple

	if (!r->delay_linear)
		return -ENOTSUPP;

would be the proper thing to do.
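
A rough userspace sketch of how the validation points above could fit
together: reject non-linear resources, accept any value up to MAX_MBA_BW
without a granularity match, and clamp small values to min_bw. The names
bw_validate() and struct mba_res_sketch are made up for this example.
strtoul() stands in for kstrtoul() and ENOTSUP for the kernel's ENOTSUPP,
so this is an illustration under those assumptions, not the final code.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_MBA_BW	100u

/* Hypothetical subset of struct rdt_resource used by this sketch. */
struct mba_res_sketch {
	int delay_linear;
	unsigned long min_bw;
};

static int bw_validate(const char *buf, unsigned long *data,
		       const struct mba_res_sketch *r)
{
	unsigned long bw;
	char *end;

	/* Non-linear mappings are not handled here at all. */
	if (!r->delay_linear)
		return -ENOTSUP;

	/* Parse a decimal number and reject empty input or trailing junk. */
	errno = 0;
	bw = strtoul(buf, &end, 10);
	if (errno || end == buf || *end != '\0')
		return -EINVAL;

	/* Upper bound is the bandwidth maximum, which is always 100%. */
	if (bw > MAX_MBA_BW)
		return -EINVAL;

	/* Relaxed lower bound: clamp to min_bw instead of rejecting. */
	if (bw < r->min_bw)
		bw = r->min_bw;

	*data = bw;
	return 0;
}
```

With min_bw at 10, an input of "35" is accepted as is even though it is not
a multiple of 10, "3" is clamped to 10, and "101" is rejected with -EINVAL.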

> +/*
> + * Read the user RDT control value into tempory buffer:
> + * Cache bit mask (hex) or Memory b/w throttle (decimal).

That's a really valuable comment for this function. NOT!

I really wonder how you manage to not confuse yourself with the
inconsistent mess you create all over the place. Maybe you do....

Thanks,

tglx