Re: [PATCH] ACPI: thermal: Do not call acpi_thermal_check() directly

From: Stephen Berman
Date: Sun Jan 24 2021 - 08:42:28 EST


On Fri, 22 Jan 2021 17:42:59 +0100 "Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote:

> On Fri, Jan 22, 2021 at 5:39 PM Stephen Berman <stephen.berman@xxxxxxx> wrote:
>>
>> On Fri, 22 Jan 2021 17:23:36 +0100 "Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote:
>>
>> > On Thu, Jan 14, 2021 at 7:35 PM Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
>> >>
>> >> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>> >>
>> >> Calling acpi_thermal_check() from acpi_thermal_notify() directly
>> >> is problematic if _TMP triggers Notify () on the thermal zone for
>> >> which it has been evaluated (which happens on some systems), because
>> >> it causes a new acpi_thermal_notify() invocation to be queued up
>> >> every time and if that takes place too often, an indefinite number of
>> >> pending work items may accumulate in kacpi_notify_wq over time.
>> >>
>> >> Besides, it is not really useful to queue up a new invocation of
>> >> acpi_thermal_check() if one of them is pending already.
>> >>
>> >> For these reasons, rework acpi_thermal_notify() to queue up a thermal
>> >> check instead of calling acpi_thermal_check() directly and only allow
>> >> one thermal check to be pending at a time. Moreover, only allow one
>> >> acpi_thermal_check_fn() instance at a time to run
>> >> thermal_zone_device_update() for one thermal zone and make it return
>> >> early if it sees other instances running for the same thermal zone.
>> >>
>> >> While at it, fold acpi_thermal_check() into acpi_thermal_check_fn(),
>> >> as it is only called from there after the other changes made here.
>> >>
>> >> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208877
>> >> Reported-by: Stephen Berman <stephen.berman@xxxxxxx>
>> >> Diagnosed-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
>> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>> >
>> > Well, it's been over a week since this was posted.
>> >
>> > Does anyone have any comments?
>>
>> Sorry, I haven't been able to make time to test the patch yet, but I'll
>> try to do so this weekend. Is it just the patch below that I should
>> apply, ignoring the previous patches you sent?
>
> Yes.
>
>> And can I apply it to the current mainline kernel?
>
> Yes, it should be applicable to the current mainline (at least as of 5.11-rc4).
>
> Thanks!

I've now updated my local repo to 5.11.0-rc4+, applied your patch,
rebuilt and installed the kernel, rebooted (without adding
'thermal.tzp=300' to the kernel command line), did some normal activity,
then ran 'shutdown -h now', and the machine did just that. So your
patch seems to have fixed the problem I reported. Many thanks!

Steve Berman
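
For readers following the thread, the "only allow one thermal check to
be pending at a time" idea from the quoted commit message can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not the code from the patch: struct zone, zone_notify(),
zone_check_fn() and simulate_burst() are hypothetical names, a C11
atomic flag stands in for whatever the driver actually uses, and a
counter stands in for the call to thermal_zone_device_update().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a thermal zone; not the kernel's struct. */
struct zone {
	atomic_bool check_pending;	/* true while a check is queued but not yet run */
	int checks_run;			/* number of checks that actually executed */
};

/*
 * Called for every notification.  Returns true if this call queued a
 * check, false if one was already pending and the request was dropped.
 */
static bool zone_notify(struct zone *z)
{
	bool expected = false;

	/* Only one caller can flip the flag from false to true. */
	return atomic_compare_exchange_strong(&z->check_pending, &expected, true);
}

/*
 * The deferred work item.  The flag is cleared before the check runs so
 * that a notification arriving during the check can queue a fresh one.
 */
static void zone_check_fn(struct zone *z)
{
	/* A check should only run after it has been queued. */
	assert(atomic_load(&z->check_pending));

	atomic_store(&z->check_pending, false);
	z->checks_run++;	/* stand-in for thermal_zone_device_update() */
}

/* Deliver n notifications back to back; return how many were queued. */
static int simulate_burst(struct zone *z, int n)
{
	int queued = 0;

	for (int i = 0; i < n; i++)
		if (zone_notify(z))
			queued++;

	return queued;
}
```

With this scheme a burst of notifications on one zone leaves a single
pending check rather than one work item per notification, which is the
accumulation problem the commit message describes. Calling
simulate_burst() on a freshly zeroed zone followed by one
zone_check_fn() is enough to see the effect.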