Re: [PATCH v2] nvme: Add hardware monitoring support

From: Guenter Roeck
Date: Wed Nov 06 2019 - 17:30:51 EST


On Wed, Nov 06, 2019 at 10:29:21PM +0100, Pavel Machek wrote:
> Hi!
>
> > > nvme devices report temperature information in the controller information
> > > (for limits) and in the smart log. Currently, the only means to retrieve
> > > this information is the nvme command line interface, which requires
> > > super-user privileges.
> > >
> > > At the same time, it would be desirable to use NVME temperature information
> > > for thermal control.
> > >
> > > This patch adds support to read NVME temperatures from the kernel using the
> > > hwmon API and adds temperature zones for NVME drives. The thermal subsystem
> > > can use this information to set thermal policies, and userspace can access
> > > it using libsensors and/or the "sensors" command.
> > >
> > > Example output from the "sensors" command:
> > >
> > > nvme0-pci-0100
> > > Adapter: PCI adapter
> > > Composite: +39.0°C (high = +85.0°C, crit = +85.0°C)
> > > Sensor 1: +39.0°C
> > > Sensor 2: +41.0°C
> > >
> > > Signed-off-by: Guenter Roeck <linux@xxxxxxxxxxxx>
> >
> > This looks fine to me, but I'll wait a few more days to see if there are
> > any additional comments.
>
> User wants to know temperature of /dev/sda... and we already have
> userspace tools knowing about smart, etc...
>
> pavel@amd:/data/film$ sudo hddtemp /dev/sda
> /dev/sda: ST1000LM014-1EJ164: 48°C
>
> I see we also have sensors framework but it does _not_ handle
> harddrive temperatures.
>
> Does it need some kind of unification? Should NVMe devices expose
> "SMART" information in the same way other SSDs do?
>

The unified way to report hardware monitoring information to userspace
already exists: it is the sensors framework. Also, users in general
prefer not to have to run "sudo" to get such information.
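
For illustration only, here is a minimal sketch of how an unprivileged
program could read such values through the hwmon sysfs files, without
libsensors. It relies on the standard hwmon layout (a "name" file plus
tempN_input in millidegrees Celsius and an optional tempN_label). The
chip name "nvme" and the helper read_hwmon_temps() are assumptions made
for this example, not something taken from the patch.

```python
import glob
import os


def read_hwmon_temps(base="/sys/class/hwmon", chip_name="nvme"):
    """Return {(hwmon_dir, label): degrees_celsius} for matching chips.

    chip_name is an assumed value for the hwmon "name" attribute.
    """
    readings = {}
    for chip_dir in sorted(glob.glob(os.path.join(base, "hwmon*"))):
        try:
            with open(os.path.join(chip_dir, "name")) as f:
                name = f.read().strip()
        except OSError:
            continue  # no readable "name" file, skip this entry
        if name != chip_name:
            continue
        for input_path in sorted(glob.glob(os.path.join(chip_dir, "temp*_input"))):
            stem = input_path[: -len("_input")]
            try:
                with open(stem + "_label") as f:
                    label = f.read().strip()
            except OSError:
                # Labels are optional; fall back to e.g. "temp2".
                label = os.path.basename(stem)
            try:
                with open(input_path) as f:
                    millidegrees = int(f.read().strip())
            except (OSError, ValueError):
                continue  # sensor not readable right now
            # hwmon reports temperatures in millidegrees Celsius.
            readings[(os.path.basename(chip_dir), label)] = millidegrees / 1000.0
    return readings


if __name__ == "__main__":
    for (chip, label), temp in read_hwmon_temps().items():
        print(f"{chip} {label}: {temp:+.1f}°C")
```

The base directory is a parameter so the function can be pointed at a
test tree. On a real system the default path is world-readable, which is
the point made above about not needing "sudo".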

Guenter