Re: [PATCH v8 15/15] x86/split_lock: Add a sysfs interface to enable/disable split lock detection during run time

From: Ingo Molnar
Date: Thu Apr 25 2019 - 02:31:24 EST



* Fenghua Yu <fenghua.yu@xxxxxxxxx> wrote:

> To work around or debug a split lock issue, the administrator may need to
> disable or enable split lock detection during run time without rebooting
> the system.
>
> The interface /sys/devices/system/cpu/split_lock_detect is added to allow
> the administrator to disable or enable split lock detection and show
> current split lock detection setting.
>
> Writing [yY1] or [oO][nN] to the file enables split lock detection and
> writing [nN0] or [oO][fF] disables split lock detection. Split lock
> detection is enabled or disabled on all CPUs.
>
> Reading the file returns current global split lock detection setting:
> 0: disabled
> 1: enabled
>
> Add an ABI document entry for /sys/devices/system/cpu/split_lock_detect.
>
> Signed-off-by: Fenghua Yu <fenghua.yu@xxxxxxxxx>
> ---
> Not sure if the justification for the sysfs knob is valid. If not, this
> patch could be removed from this patch set.
>
> .../ABI/testing/sysfs-devices-system-cpu | 22 ++++++++
> arch/x86/kernel/cpu/intel.c | 52 ++++++++++++++++++-
> 2 files changed, 72 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index 9605dbd4b5b5..aad7b1698065 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -67,6 +67,28 @@ Description: Discover NUMA node a CPU belongs to
> /sys/devices/system/cpu/cpu42/node2 -> ../../node/node2
>
>
> +What: /sys/devices/system/cpu/split_lock_detect
> +Date: March 2019
> +Contact: Linux kernel mailing list <linux-kernel@xxxxxxxxxxxxxxx>
> +Description: (RW) Control split lock detection on Intel Tremont and
> + future CPUs
> +
> + Reads return split lock detection status:
> + 0: disabled
> + 1: enabled
> +
> + Writes enable or disable split lock detection:
> + The first character is one of 'Nn0' or [oO][fF] for off
> + disables the feature.
> + The first character is one of 'Yy1' or [oO][nN] for on
> + enables the feature.
> +
> + Please note the interface only shows or controls global setting.
> + During run time, split lock detection on one CPU may be
> + disabled if split lock operation in kernel code happens on
> + the CPU. The interface doesn't show or control split lock
> + detection on individual CPU.

I.e. the global setting the interface reports and the actual per-CPU
state can be out of sync. Why?
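
To make the mismatch concrete, here is a minimal userspace model of the
behaviour the ABI text describes. It is only a sketch: NR_CPUS,
cpu_detect_bit[], apply_global_setting(), kernel_split_lock_on(),
sysfs_show() and cpus_out_of_sync() are hypothetical names invented for
the illustration, and the bool array merely stands in for the per-CPU
MSR bit.

```c
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for the illustration */

/* Global setting, i.e. what the proposed sysfs file reports. */
static bool split_lock_detect_enabled = true;

/* Stand-in for the per-CPU split lock detection bit in the MSR. */
static bool cpu_detect_bit[NR_CPUS];

/* Propagate the global setting to every CPU (models the on_each_cpu() call). */
static void apply_global_setting(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_detect_bit[cpu] = split_lock_detect_enabled;
}

/* Models a kernel split lock on one CPU: only that CPU's bit is cleared. */
static void kernel_split_lock_on(int cpu)
{
	cpu_detect_bit[cpu] = false;
}

/* What a read of the sysfs file returns: the global flag only. */
static int sysfs_show(void)
{
	return split_lock_detect_enabled ? 1 : 0;
}

/* Number of CPUs whose actual state disagrees with the reported one. */
static int cpus_out_of_sync(void)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_detect_bit[cpu] != split_lock_detect_enabled)
			n++;
	return n;
}
```

After apply_global_setting() followed by kernel_split_lock_on(2), the
file still reads 1 while one CPU no longer has detection enabled.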

Also, if it's a global flag, why waste memory on putting a sysfs knob
into every CPU's sysfs file?

Finally, why is a debugging facility in sysfs, why not a debugfs knob?
Using a sysctl would solve the percpu vs. global confusion as well ...

> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -35,6 +35,7 @@
> DEFINE_PER_CPU(u64, msr_test_ctl_cache);
> EXPORT_PER_CPU_SYMBOL_GPL(msr_test_ctl_cache);
>
> +static DEFINE_MUTEX(split_lock_detect_mutex);
> static bool split_lock_detect_enable;

'enable' is a verb in plain form - which we use for function names.

For variable names that denote current state we typically use past
tense, i.e. 'enabled'.

(The only case where we'd use the split_lock_detect_enable name for a flag
is if it's a flag to trigger some sort of enabling action - which this
isn't.)
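
A small illustration of the convention, as a userspace sketch. The two
helper functions are hypothetical and not part of the posted patch;
they only show where the plain verb form would naturally be used.

```c
#include <stdbool.h>

/* State flag: past tense, describes what is currently the case. */
static bool split_lock_detect_enabled;

/* Actions: plain verb form, describes what the function does. */
static void split_lock_detect_enable(void)
{
	split_lock_detect_enabled = true;
}

static void split_lock_detect_disable(void)
{
	split_lock_detect_enabled = false;
}
```

With this split, 'if (split_lock_detect_enabled)' reads as a question
about state, and 'split_lock_detect_enable()' reads as a command.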

Please review the whole series for various naming mishaps.

> + mutex_lock(&split_lock_detect_mutex);
> +
> + split_lock_detect_enable = val;
> +
> + /* Update the split lock detection setting in MSR on all online CPUs. */
> + on_each_cpu(split_lock_update_msr, NULL, 1);
> +
> + if (split_lock_detect_enable)
> + pr_info("enabled\n");
> + else
> + pr_info("disabled\n");
> +
> + mutex_unlock(&split_lock_detect_mutex);

Instead of a mutex, please just use the global atomic debug flag which
controls the warning printout. By using that flag both for the WARN()ing
and for controlling MSR state all the races are solved and the code is
simplified.
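
A rough userspace sketch of that shape, using C11 atomics in place of
the kernel's atomic helpers. Everything here is an assumption made for
illustration: NR_CPUS, cpu_detect_bit[], split_lock_detect_store() and
should_warn() are hypothetical names, the loop stands in for
on_each_cpu(), and the exact flag from earlier in the series is not
reproduced. The point is only that one atomic flag is the single source
of truth for both the warning and the MSR state.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for the illustration */

/* The single global flag: read by the warning path and the MSR update. */
static atomic_bool split_lock_detect_enabled;

/* Stand-in for the per-CPU detection bit in the MSR. */
static bool cpu_detect_bit[NR_CPUS];

/* Per-CPU callback: always derives the MSR bit from the current flag. */
static void split_lock_update_msr(int cpu)
{
	cpu_detect_bit[cpu] = atomic_load(&split_lock_detect_enabled);
}

/*
 * Store path without a mutex: publish the new value atomically, then
 * resync every CPU from the flag. Returns true if the value changed.
 */
static bool split_lock_detect_store(bool val)
{
	bool old = atomic_exchange(&split_lock_detect_enabled, val);

	for (int cpu = 0; cpu < NR_CPUS; cpu++)	/* models on_each_cpu() */
		split_lock_update_msr(cpu);

	return old != val;
}

/* Warning path: consults the very same flag. */
static bool should_warn(void)
{
	return atomic_load(&split_lock_detect_enabled);
}
```

Because each per-CPU update re-reads the flag rather than taking a
value passed in by the writer, the last resync to run brings the CPUs
in line with whatever value was stored last.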


Thanks,

Ingo