Re: [PATCH v2 4/7] platform/x86/intel/ifs: Implement Array BIST test

From: Dave Hansen
Date: Wed Feb 15 2023 - 11:58:52 EST


On 2/14/23 15:44, Jithu Joseph wrote:
> +/* MSR_ARRAY_BIST bit fields */
> +union ifs_array {
> + u64 data;
> + struct {
> + u32 array_bitmask :32;
> + u32 array_bank :16;
> + u32 rsvd :15;
> + u32 ctrl_result :1;
> + };
> +};

Wow, after all that, the bitfield remains! Even the totally unnecessary
parts, like the 32-bit wide bitfield in the 32-bit value. The *LEAST*
you can do is this:

struct ifs_array {
u32 array_bitmask;
u16 array_bank
u16 rsvd :15;
u16 ctrl_result :1;
};

to at least minimize the *ENTIRELY* unnecessary bitfields. I'm also not
quite sure why I'm going to the trouble of writing another one of these
things since the last set of things I suggested were not incorporated.
Or, what the obsession is with u32.

I also think the union is a bit silly. You could literally just do:

msrvals[0] = (u64 *)&activate;
or
memcpy(&msrvals[0], &activate, sizeof(activate));

and probably end up with exactly the same instructions. There's no type
safety here either way. The cast, for instance, at least makes the lack
of type safety obvious.

> +static void ifs_array_test_core(int cpu, struct device *dev)
> +{
> + union ifs_array activate, status = {0};

So, 'status' here is initialized to 0. But, 'activate'... hmmm

Here's 1 of the 4 fields getting initialized:

> + activate.array_bitmask = ~0U;
> + timeout = jiffies + HZ / 2;
> +
> + do {
> + if (time_after(jiffies, timeout)) {
> + timed_out = true;
> + break;
> + }
> +
> + msrvals[0] = activate.data;

and then the *WHOLE* union is read here. What *is* the uninitialized
member behavior of a bitfield? I actually haven't the foggiest idea
since I never use them. Is there some subtly C voodoo that initializes
the other 3 fields?

BTW, ifs_test_core() seems to be doing basically the same thing with
ifs_status.

> + atomic_set(&array_cpus_out, 0);
> + stop_core_cpuslocked(cpu, do_array_test, msrvals);
> + status.data = msrvals[1];
> +
> + if (status.ctrl_result)
> + break;
> +
> + activate.array_bitmask = status.array_bitmask;
> + activate.array_bank = status.array_bank;
> +
> + } while (status.array_bitmask);
> +
> + ifsd->scan_details = status.data;

Beautiful. It passes raw MSR values back out to userspace in sysfs.

OK, so all this union.data nonsense is to, in the end, pass two MSR
values through an array over to the do_array_test() function. That
function, in the end, just does:

+ if (cpu == first) {
+ wrmsrl(MSR_ARRAY_BIST, msrs[0]);
+ /* Pass back the result of the test */
+ rdmsrl(MSR_ARRAY_BIST, msrs[1]);
+ }

with them. It doesn't even reuse msrs[0]. It also doesn't bother to
even give them symbolic names, like command and result. The msrs[]
values are also totally hard coded.

I'd probably do something like the attached patch. It gets rid of
'data' and uses sane types for the bitfield. It does away with separate
variables and munging into/out of the msr[] array and just passes a
single command struct to the work function. It doesn't have any
uninitialized structure/bitfield fields.

Oh, and it's less code.

> + if (status.ctrl_result)
> + ifsd->status = SCAN_TEST_FAIL;
> + else if (timed_out || status.array_bitmask)
> + ifsd->status = SCAN_NOT_TESTED;
> + else
> + ifsd->status = SCAN_TEST_PASS;
> +}
> +
> /*
> * Initiate per core test. It wakes up work queue threads on the target cpu and
> * its sibling cpu. Once all sibling threads wake up, the scan test gets executed and
> @@ -253,6 +341,8 @@ int do_core_test(int cpu, struct device *dev)
> ifs_test_core(cpu, dev);
> break;
> case IFS_ARRAY:
> + ifs_array_test_core(cpu, dev);
> + break;
> default:
> return -EINVAL;
> }
ifs.h | 13 +++++--------
runtest.c | 34 +++++++++++++++-------------------
2 files changed, 20 insertions(+), 27 deletions(-)