Re: [lm-sensors] 3.13.?: Strange / dangerous fan policy...

From: Rafael J. Wysocki
Date: Sat Mar 08 2014 - 07:22:00 EST


On Saturday, March 08, 2014 12:08:31 PM Jean Delvare wrote:
> On Fri, 7 Mar 2014 14:52:30 -0800, Guenter Roeck wrote:
> > On Fri, Mar 07, 2014 at 11:04:29PM +0100, Manuel Krause wrote:
> > > Hi, and thanks for the quick response!
> > > No special fancy "fan control policy". 'fancontrol' isn't up or
> > > running.
> > > Vanilla kernels 3.11.* and 3.12.* had been working here without
> > > any extra work.
> > > --
> > > # sensors
> > > acpitz-virtual-0
> > > Adapter: Virtual device
> > > temp1: +71.0°C (crit = +256.0°C)
> > > temp2: +69.0°C (crit = +110.0°C)
> > > temp3: +52.0°C (crit = +105.0°C)
> > > temp4: +25.0°C (crit = +110.0°C)
> > > temp5: +58.0°C (crit = +110.0°C)
> > >
> > > coretemp-isa-0000
> > > Adapter: ISA adapter
> > > Core 0: +62.0°C (high = +105.0°C, crit = +105.0°C)
> > > Core 1: +60.0°C (high = +105.0°C, crit = +105.0°C)
> > > --
> > > My notebook (HP/Compaq 6730b) does not have a separate fan sensor.
> > > This is with 3.12.13 with my normal workload.
> > >
> > > Please trust my above-mentioned values of 94°C vs. 74°C, as I
> > > don't like to boot 3.13.6 anymore, to avoid harm to the notebook's
> > > casing.
> >
> > Understood. Unfortunately, we'll need to get information
> > from the new kernel to be able to track down the problem.
>
> Indeed. Not only the run-time temperatures, but also the high and crit
> limits.
>
> > > But I'd be willing to test any patch that might improve things.
> >
> > So far I have no idea what is going on. I don't see anything in the
> > drivers providing above data that would explain the behavior,
> > but I might be missing something.
>
> Looks like a regression in the acpi subsystem or in power management,
> not hwmon. Hwmon is merely reporting the temperatures, it's not
> responsible for the actual temperatures.
>
> A bisection would certainly help, but of course that would require
> booting to a bad kernel half of the time, which I understand Manuel
> wouldn't enjoy.
>
> The only two components which I think can reach such high temperatures
> in a laptop are the CPU and the GPU. I suppose that the "94°C vs.
> 74°C" refers to acpitz's temp1? If the temperatures reported by
> coretemp remain the same, then I can only suppose that temp1 is the GPU
> temperature. Please tell us which GPU is in this laptop, and which
> driver you're using.

Also it would be good to know which cpufreq and cpuidle drivers are in use
and whether or not 3.14-rc5 has the problem.
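
In case it helps, here is a minimal sketch for collecting that information.
It assumes the usual sysfs locations for the cpufreq and cpuidle driver
names; the read_attr helper is just an illustrative name, and "unknown" is
printed when an attribute is not present (for example because no such
driver is loaded).

```shell
#!/bin/sh
# Print the contents of a sysfs attribute, or "unknown" if the file is
# not readable (e.g. the corresponding driver is not loaded).
read_attr() {
    if [ -r "$1" ]; then
        cat "$1"
    else
        echo "unknown"
    fi
}

# Running kernel version, then the cpufreq and cpuidle drivers in use.
echo "kernel:  $(uname -r)"
echo "cpufreq: $(read_attr /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver)"
echo "cpuidle: $(read_attr /sys/devices/system/cpu/cpuidle/current_driver)"
```

Running it once on the good kernel and once on the bad one should show
whether the driver selection differs between the two.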

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.