Re: [PATCH linux v9 2/5] hwmon: occ: Add sysfs interface

From: Eddie James
Date: Fri Apr 28 2017 - 11:23:17 EST




On 04/02/2017 06:19 AM, Guenter Roeck wrote:
On 03/14/2017 01:55 PM, Eddie James wrote:
From: "Edward A. James" <eajames@xxxxxxxxxx>

Add a generic mechanism to expose the sensors provided by the OCC in
sysfs.

Signed-off-by: Edward A. James <eajames@xxxxxxxxxx>
Signed-off-by: Andrew Jeffery <andrew@xxxxxxxx>
---
Documentation/hwmon/occ | 62 +++++++++++
drivers/hwmon/occ/Makefile | 2 +-
drivers/hwmon/occ/occ_sysfs.c | 253 ++++++++++++++++++++++++++++++++++++++++++
drivers/hwmon/occ/occ_sysfs.h | 25 +++++
4 files changed, 341 insertions(+), 1 deletion(-)
create mode 100644 drivers/hwmon/occ/occ_sysfs.c
create mode 100644 drivers/hwmon/occ/occ_sysfs.h

diff --git a/Documentation/hwmon/occ b/Documentation/hwmon/occ
index d1c863b..580af26 100644
--- a/Documentation/hwmon/occ
+++ b/Documentation/hwmon/occ
@@ -27,6 +27,68 @@ Currently, all versions of the OCC support four types of sensor data: power,
temperature, frequency, and "caps," which indicate limits and thresholds used
internally on the OCC.

+sysfs Entries
+-------------
+
+The OCC driver uses the hwmon sysfs framework to provide data to userspace.
+
+The driver exports a number of sysfs files for each type of sensor. The
+sensor-specific files vary depending on the processor type, though many of the
+attributes are common for both the POWER8 and POWER9.
+
+The hwmon interface cannot define every type of sensor that may be used.
+Therefore, the frequency sensor on the OCC uses the "input" type sensor defined
+by the hwmon interface, rather than defining a new type of custom sensor.
+
+Below are detailed the names and meaning of each sensor file for both types of
+processors. All sensors are read-only unless otherwise specified. <x> indicates
+the hwmon index. sensor id indicates the unique internal OCC identifer. Please
+see the POWER OCC specification for details on all these sensor values.
+
+frequency:
+ all processors:
+ in<x>_input - frequency value
+ in<x>_label - sensor id
+temperature:
+ POWER8:
+ temp<x>_input - temperature value
+ temp<x>_label - sensor id
+ POWER9 (in addition to above):
+ temp<x>_type - FRU type
+
+power:
+ POWER8:
+ power<x>_input - power value
+ power<x>_label - sensor id
+ power<x>_average - accumulator
+ power<x>_average_interval - update tag (number of samples in
+ accumulator)
+ POWER9:
+ power<x>_input - power value
+ power<x>_label - sensor id
+ power<x>_average_min - accumulator[0]
+ power<x>_average_max - accumulator[1] (64 bits total)
+ power<x>_average_interval - update tag
+ power<x>_reset_history - (function_id | (apss_channel << 8)
+
+caps:
+ POWER8:
+ power<x>_cap - current powercap
+ power<x>_cap_max - max powercap
+ power<x>_cap_min - min powercap
+ power<x>_max - normal powercap
+ power<x>_alarm - user powercap, r/w
+ POWER9:
+ power<x>_cap_alarm - user powercap source
+
+The driver also provides two sysfs entries through hwmon to better
+control the driver and monitor the master OCC. Though there may be multiple
+OCCs present on the system, these two files are only present for the "master"
+OCC.
+ name - read the name of the driver
+ update_interval - read or write the minimum interval for polling the
+ OCC.
+
BMC - Host Communications
-------------------------

diff --git a/drivers/hwmon/occ/Makefile b/drivers/hwmon/occ/Makefile
index 3ed79a5..67b5367 100644
--- a/drivers/hwmon/occ/Makefile
+++ b/drivers/hwmon/occ/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_SENSORS_IBM_OCC) += occ.o
+obj-$(CONFIG_SENSORS_IBM_OCC) += occ.o occ_sysfs.o
diff --git a/drivers/hwmon/occ/occ_sysfs.c b/drivers/hwmon/occ/occ_sysfs.c
new file mode 100644
index 0000000..50b20e2
--- /dev/null
+++ b/drivers/hwmon/occ/occ_sysfs.c
@@ -0,0 +1,253 @@
+/*
+ * occ_sysfs.c - OCC sysfs interface
+ *
+ * This file contains the methods and data structures for implementing the OCC
+ * hwmon sysfs entries.
+ *
+ * Copyright 2017 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/init.h>
+#include <linux/jiffies.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+#include "occ.h"
+#include "occ_sysfs.h"
+
+#define OCC_HWMON_NAME_LENGTH 32
+
+struct occ_sysfs {
+ struct device *dev;
+ struct occ *occ;
+
+ char label_buffer[OCC_HWMON_NAME_LENGTH + 1];
+ char hwmon_name[OCC_HWMON_NAME_LENGTH + 1];
+ const u32 *sensor_hwmon_configs;
+ struct hwmon_channel_info **occ_sensors;
+ struct hwmon_chip_info occ_info;
+ u16 user_powercap;
+};
+
+static int occ_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
+ u32 attr, int channel, long *val)
+{
+ int rc;
+ struct occ_sysfs *driver = dev_get_drvdata(dev);
+ struct occ *occ = driver->occ;
+
+ switch (type) {
+ case hwmon_in:
+ rc = occ_get_sensor_field(occ, FREQ, channel, attr, val);
+ break;
+ case hwmon_temp:
+ rc = occ_get_sensor_field(occ, TEMP, channel, attr, val);
+ break;
+ case hwmon_power:
+ rc = occ_get_sensor_field(occ, POWER, channel, attr, val);
+ break;
+ default:
+ rc = -EOPNOTSUPP;
+ }
+
+ return rc;
+}
+
+static int occ_hwmon_read_string(struct device *dev,
+ enum hwmon_sensor_types type, u32 attr,
+ int channel, const char **str)
+{
+ int rc;
+ unsigned long val = 0;
+ struct occ_sysfs *driver = dev_get_drvdata(dev);
+
+ if (!((type == hwmon_in && attr == hwmon_in_label) ||
+ (type == hwmon_temp && attr == hwmon_temp_label) ||
+ (type == hwmon_power && attr == hwmon_power_label)))
+ return -EOPNOTSUPP;
+
+ /* will fetch the "label", the sensor_id */
+ rc = occ_hwmon_read(dev, type, attr, channel, &val);
+ if (rc < 0)
+ return rc;
+
+ /* just use one label buffer for all sensors. works with current hwmon
+ * implementation. only alternative is to store a buffer for each
+ * sensor, which gets expensive quickly.
+ */

Sorry for the late reply.

No, this doesn't work and is racy. Reading all labels from multiple threads
in parallel will result in random data.

The label is supposed to be constant for each sensor. If it isn't, it is not
a label. Either create it as constant and use the generated string, or drop
the attribute.

Guenter

Thanks for catching that. The driver is undergoing some significant restructuring as a result of changes in design to the low-level driver for the P9 side of things (actually doing the communication with the OCC). Depending on how extensive the changes are (haven't had time to finalize), this patchset could be abandoned and I'll start again.

Thanks for the feedback so far.

Eddie