Re: [PATCH v3 6/8] platform/x86/intel/ifs: Implement Array BIST test

From: Hans de Goede
Date: Thu Mar 16 2023 - 06:01:30 EST


Hi,

On 3/13/23 17:37, Joseph, Jithu wrote:
> Hans,
>
> Thank-you very much for the review.
>
> On 3/13/2023 9:24 AM, Hans de Goede wrote:
>
>
>>>
>>> +#define SPINUNIT 100 /* 100 nsec */
>>> +static atomic_t array_cpus_out;
>>
>> This variable is only inc-ed + read, it is never reset to 0
>> so the "while (atomic_read(t) < all_cpus)"
>> check only works for the first test run.
>>
>
> It is reset to zero as annotated below. Let me know if this doesn't address your concern.

Ah, I somehow missed that despite looking for it twice, yes that is fine.

>> Also even static atomic_t variables must be initialized, you cannot
>> assume that using using zeroed mem is a valid value for an atomic_t.
>>
>> And this is also shared between all smt pairs, so if 2 "real"
>> CPU cores with both 2 sibblings are asked to run IFS tests at
>> the same time, then array_cpus_out will get increased 4 times
>> in total, breaking the wait_for_sibbling loop as soon as
>> the counter reaches 2, so before the tests are done.
>
> Only one IFS test is allowed at a time. This is done using "ifs_sem" defined in sysfs.c

Ah I see.

After taking a closer look I do see one unrelated issue with this patch
sysfs.c: run_test_store() does:

if (!ifsd->loaded)
rc = -EPERM;
else
rc = do_core_test(cpu, dev);

But AFAICT the loaded check really only applies to the first (intel_ifs_0 device) test type and the
Array BIST test should work fine when loaded is false.

So I think that the if (!ifsd->loaded) error check should be moved to
ifs_test_core() ?

Regards,

Hans


>
> ...
>
>>> +static void ifs_array_test_core(int cpu, struct device *dev)
>>> +{
>>> + union ifs_array command = {};
>>> + bool timed_out = false;
>>> + struct ifs_data *ifsd;
>>> + unsigned long timeout;
>>> +
>>> + ifsd = ifs_get_data(dev);
>>> +
>>> + command.array_bitmask = ~0U;
>>> + timeout = jiffies + HZ / 2;
>>> +
>>> + do {
>>> + if (time_after(jiffies, timeout)) {
>>> + timed_out = true;
>>> + break;
>>> + }
>>> + atomic_set(&array_cpus_out, 0);
>
> The above line is where the zero initialization happens before every test.
>
>>> + stop_core_cpuslocked(cpu, do_array_test, &command);
>>> +
>>> + if (command.ctrl_result)
>>> + break;
>>> + } while (command.array_bitmask);
>>> +
>>> + ifsd->scan_details = command.data;
>>> +
>>> + if (command.ctrl_result)
>>> + ifsd->status = SCAN_TEST_FAIL;
>>> + else if (timed_out || command.array_bitmask)
>>> + ifsd->status = SCAN_NOT_TESTED;
>>> + else
>>> + ifsd->status = SCAN_TEST_PASS;
>>> +}
>
> Jithu
>