RE: [RFC] Documentation: Add documentation for new performance_profile sysfs class

From: Limonciello, Mario
Date: Mon Oct 05 2020 - 12:11:31 EST


> On Monday, October 5, 2020 at 14:58, Limonciello, Mario wrote:
> > > On modern systems CPU/GPU/... performance is often dynamically configurable
> > > in the form of e.g. variable clock-speeds and TDP. The performance is often
> > > automatically adjusted to the load by some automatic-mechanism (which may
> > > very well live outside the kernel).
> > > These auto performance-adjustment mechanisms often can be configured with
> > > one of several performance-profiles, with either a bias towards low-power
> > > consumption (and cool and quiet) or towards performance (and higher power
> > > consumption and thermals).
> > > Introduce a new performance_profile class/sysfs API which offers a generic
> > > API for selecting the performance-profile of these automatic-mechanisms.
> >
> > If introducing an API for this - let me ask the question, why even have each
> > driver offer a class interface, so that userspace needs to change "each" driver's
> > performance setting?
> >
> > I would think that you could just offer something kernel-wide like
> > /sys/power/performance-profile
> >
> > Userspace can read and write to a single file. All drivers can get notified
> > on this sysfs file changing.
> >
>
> That makes sense, in my opinion, from the regular user's perspective:
> one switch to rule them all, no fuss. However, I don't think that scales well.
> What if the hypothetical user wants to run a CPU-heavy workload, and thus wants
> to put the GPU into "low-power" mode and the CPU into "performance" mode? What if
> the user wants to put one GPU into "low-power" mode, but the other one into
> "performance"? With the current specification, the user's needs could be easily
> satisfied. I don't see how that's possible with a single switch. Nonetheless, I think
> that a single global switch *in addition* to the class devices could possibly
> simplify the userspace-kernel interaction for most users.

I think that the moment you represent a platform/system device as a switch you
lose the ability to set different policies for other types of devices separately.

What if using the platform/system device actually /also/ orchestrates changes to
those other devices you mention? Then it becomes an order-of-events problem, or
worse, a problem where one switch shows the "wrong" value.
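
To make the ordering concern concrete, here is a toy userspace model in Python.
It is only a sketch: the Device and PlatformSwitch classes, the profile names
and the run() helper are all hypothetical and do not correspond to any kernel
interface. It assumes the platform switch rewrites the per-device settings
whenever it is written.

```python
# Toy model only: names are hypothetical, nothing here is kernel API.
class Device:
    def __init__(self, name, profile="balanced"):
        self.name = name
        self.profile = profile


class PlatformSwitch:
    """Platform-level switch that also rewrites the devices it orchestrates."""

    def __init__(self, devices):
        self.devices = devices
        self.profile = "balanced"

    def set_profile(self, profile):
        self.profile = profile
        for dev in self.devices:
            dev.profile = profile  # silently overrides any per-device choice


def run(writes):
    """Apply (target, profile) writes in order and return the final state."""
    cpu, gpu = Device("cpu"), Device("gpu")
    platform = PlatformSwitch([cpu, gpu])
    targets = {"platform": platform, "cpu": cpu, "gpu": gpu}
    for target, profile in writes:
        obj = targets[target]
        if isinstance(obj, PlatformSwitch):
            obj.set_profile(profile)
        else:
            obj.profile = profile
    return {name: obj.profile for name, obj in targets.items()}


# Same two writes, different order, different end state.
gpu_first = run([("gpu", "low-power"), ("platform", "performance")])
platform_first = run([("platform", "performance"), ("gpu", "low-power")])
```

In this model, writing the GPU first loses the GPU setting once the platform
switch is written. Writing the platform first keeps the GPU in "low-power", but
the platform switch still reads "performance" even though not every device
matches it.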

>
>
> > The systems that react in firmware (such as the two that prompted
> > this discussion) can change at that time. It also leaves open the possibility
> > of a more open kernel implementation that does the same thing by
> > directly modifying device registers instead of ACPI devices.
> >
>
> Excuse my ignorance, but I don't really see why this interface would be tied to
> ACPI devices? Why is it not possible to write a driver that implements this interface
> and directly modifies device registers? Am I missing something obvious here?
>

When implemented for the two vendors mentioned here, it would be using a
proprietary "firmware API" implemented by those two vendors. For example, writing
arguments (0x1, 0x2) to the ACPI-WMI method WMFT causes firmware to coordinate,
using an undisclosed protocol, to effect the desired platform changes.

This is different in my mind from "kernel writes to a specific register" to set
power properties of a specific device.