Re: [PATCH] ACPI: cpufreq: use a platform device to load ACPI PPC and PCC drivers

From: Petr Pavlu
Date: Sun Feb 12 2023 - 10:49:34 EST


On 2/9/23 18:20, Rafael J. Wysocki wrote:
> On Tue, Jan 31, 2023 at 2:01 PM Petr Pavlu <petr.pavlu@xxxxxxxx> wrote:
>>
>> The acpi-cpufreq and pcc-cpufreq drivers are loaded through per-CPU
>> module aliases. This can result in many unnecessary load requests during
>> boot if another frequency module, such as intel_pstate, is already
>> active. For instance, on a typical Intel system, one can observe that
>> udev makes 2x#CPUs attempts to insert acpi_cpufreq and 1x#CPUs attempts
>> for pcc_cpufreq. All these tries then fail if another frequency module
>> is already registered.
>
> Right, which is unnecessary overhead.
>
>> Both acpi-cpufreq and pcc-cpufreq drivers have their platform firmware
>> interface defined by ACPI. Allowed performance states and parameters
>> must be same for each CPU.
>
> This is an assumption made by the code in those drivers, the
> specification doesn't actually require this IIRC.

That statement is based on the ACPI Specification Version 6.5, section 8.4.5.
Processor Performance Control [1]:

"In a multiprocessing environment, all CPUs must support the same number of
performance states and each processor performance state must have identical
performance and power-consumption parameters. Performance objects must be
present under each processor object in the system for OSPM to utilize this
feature."

The same paragraph appears already in ACPI Specification Version 2.0 [2],
section 8.3.3 Processor Performance Control, where PPC was first added.

PCC [3] is simpler and uses a single header to describe properties
(frequencies) for all processors.

>> This makes it possible to model these
>> interfaces as platform devices.
>>
>> The patch extends the ACPI parsing logic to check the ACPI namespace if
>> the PPC or PCC interface is present and creates a virtual platform
>> device for each if it is available. The acpi-cpufreq and pcc-cpufreq
>> drivers are then updated to map to these devices.
>>
>> This allows to try loading acpi-cpufreq and pcc-cpufreq only once during
>> boot and only if a given interface is available in the firmware.
>
> This is clever, but will it always work?

I'm not sure if you have any specific problem in mind, for example, some
systems with broken ACPI tables?

When it comes to testing, I ran the patch on a few older Intel machines with
PPC and one with PCC. The acpi-cpufreq and pcc-cpufreq drivers were tested in
both built-in and loadable module configurations. It was checked that
intel_pstate remains the preferred option and is attempted to be initialized
first.

It looks though I should have also done testing on some newer machine as
I missed the ACPI0007 scenario you point out below.

>
>> Signed-off-by: Petr Pavlu <petr.pavlu@xxxxxxxx>
>> ---
>> drivers/acpi/Makefile | 1 +
>> drivers/acpi/acpi_cpufreq.c | 49 ++++++++++++++++++++++++++++++++++
>> drivers/acpi/bus.c | 1 +
>> drivers/acpi/internal.h | 2 ++
>> drivers/cpufreq/acpi-cpufreq.c | 39 +++++++++++++++------------
>> drivers/cpufreq/pcc-cpufreq.c | 34 ++++++++++++++++-------
>> 6 files changed, 99 insertions(+), 27 deletions(-)
>> create mode 100644 drivers/acpi/acpi_cpufreq.c
>>
>> diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
>> index feb36c0b9446..880db1082c3e 100644
>> --- a/drivers/acpi/Makefile
>> +++ b/drivers/acpi/Makefile
>> @@ -57,6 +57,7 @@ acpi-y += evged.o
>> acpi-y += sysfs.o
>> acpi-y += property.o
>> acpi-$(CONFIG_X86) += acpi_cmos_rtc.o
>> +acpi-$(CONFIG_X86) += acpi_cpufreq.o
>> acpi-$(CONFIG_X86) += x86/apple.o
>> acpi-$(CONFIG_X86) += x86/utils.o
>> acpi-$(CONFIG_X86) += x86/s2idle.o
>> diff --git a/drivers/acpi/acpi_cpufreq.c b/drivers/acpi/acpi_cpufreq.c
>> new file mode 100644
>> index 000000000000..7cf243c67475
>> --- /dev/null
>> +++ b/drivers/acpi/acpi_cpufreq.c
>> @@ -0,0 +1,49 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Registration of platform devices for ACPI Processor Performance Control and
>> + * Processor Clocking Control.
>> + */
>> +
>> +#include <linux/acpi.h>
>> +#include <linux/platform_device.h>
>> +
>> +#include "internal.h"
>> +
>> +static void __init cpufreq_add_device(const char *name)
>> +{
>> + struct platform_device *pdev;
>> +
>> + pdev = platform_device_register_simple(name, PLATFORM_DEVID_NONE, NULL,
>> + 0);
>> + if (IS_ERR(pdev))
>> + pr_err("%s device creation failed: %ld\n", name, PTR_ERR(pdev));
>> +}
>> +
>> +static acpi_status __init acpi_pct_match(acpi_handle handle, u32 level,
>> + void *context, void **return_value)
>> +{
>> + bool *pct = context;
>> +
>> + /* Check if the first CPU has _PCT. The data must be same for all. */
>> + *pct = acpi_has_method(handle, "_PCT");
>> + return AE_CTRL_TERMINATE;
>> +}
>> +
>> +void __init acpi_cpufreq_init(void)
>> +{
>> + acpi_status status;
>> + acpi_handle handle;
>> + bool pct = false;
>> +
>> + status = acpi_get_handle(NULL, "\\_SB", &handle);
>> + if (ACPI_FAILURE(status))
>> + return;
>> +
>> + acpi_walk_namespace(ACPI_TYPE_PROCESSOR, ACPI_ROOT_OBJECT,
>> + ACPI_UINT32_MAX, acpi_pct_match, NULL, &pct, NULL);
>
> Well, one more full ACPI Namespace walk at init time.

The code tries to do only a partial walk of the ACPI namespace by terminating
at the first found processor node.

> Also, this only covers processor objects and what about processor
> device objects?

I'll extend the scan to cover also ACPI0007 devices.

[1] https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#processor-performance-control
[2] https://uefi.org/sites/default/files/resources/ACPI_2.pdf
[3] https://acpica.org/sites/acpica/files/Processor-Clocking-Control-v1p0.pdf

Thanks,
Petr