Re: [PATCH 04/19] cpufreq: amd: introduce a new amd pstate driver to support future processors

From: Peter Zijlstra
Date: Mon Sep 13 2021 - 07:56:48 EST


On Mon, Sep 13, 2021 at 06:54:58PM +0800, Huang Rui wrote:
> On Mon, Sep 13, 2021 at 04:56:24PM +0800, Peter Zijlstra wrote:

> > > 1) Full MSR Support
> > > If current hardware has the full MSR support, we register "pstate_funcs"
> > > callback functions to implement the MSR operations to control the clocks.
> >
> > What's the WRMSR cost for those? I've not really kept track of the MSR
> > costs on AMD platforms, but on Intel it has (luckily) been coming down
> > quite a bit.
>
> Good to know, I haven't had a chance to check this yet. May I know how you
> tested this latency? MSR is a new hardware design for this solution; as
> the designer mentioned, the WRMSR low-latency register model is faster
> than going through the ACPI AML code interpreter.
>
> >
> > > 2) Shared Memory Support
> > > If current hardware doesn't have the full MSR support, that means it only
> > > provides shared memory support. We will leverage APIs in cppc_acpi libs with
> > > "cppc_funcs" to implement the target function for the frequency control.
> >
> > Right, the mailbox thing. How is the performance of this vs MSR accesses?
>
> I will check. If you have an existing test method that can be used, I
> can check it quickly.

Oh, I was mostly wondering if using the mailbox as MMIO would be faster
than an MSR, but you've already answered that above. Also:

> > > 1. As mentioned above, amd-pstate driver can implement
> > > fast_switch/adjust_perf function with full MSR operations that have better
> > > performance for schedutil and other governors.
> >
> > Why couldn't the existing cppc-cpufreq grow this?
>
> Because fast_switch adjusts the frequency directly in interrupt context.
> If we used the ACPI CPPC handling with the shared memory solution there,
> it would deadlock. So fast_switch needs to control the registers directly,
> like acpi-cpufreq and intel-pstate do.

Aah, I see, you're only doing fast_switch support when you have MSRs.
That was totally non-obvious.. :/

But then amd_pstate_adjust_perf() could just directly call the pstate
methods and we don't need that indirection *at*all*, right?
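
Roughly what I mean, as a completely untested userspace sketch. Every
sketch_* name below is made up for illustration and is not from the patch,
and the register packing is arbitrary. The slow path keeps the function
pointer because it has two backends; the fast path only exists when MSRs
do, so it can call the MSR helper directly:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU performance request register. */
static uint64_t sketch_perf_req;

/* Counts calls into the (hypothetical) shared-memory path. */
static int sketch_shmem_calls;

/*
 * Stand-in for a single register write. The bit packing is arbitrary and
 * only here so the result can be checked; it is not the hardware layout.
 */
static void sketch_msr_update_perf(uint8_t min, uint8_t des, uint8_t max)
{
	sketch_perf_req = ((uint64_t)min << 8) | ((uint64_t)des << 16) | max;
}

/*
 * Stand-in for the mailbox path. Per the thread, that path cannot be used
 * from interrupt context, so it must never be reached from the fast path.
 */
static void sketch_shmem_update_perf(uint8_t min, uint8_t des, uint8_t max)
{
	(void)min;
	(void)des;
	(void)max;
	sketch_shmem_calls++;
}

/* The slow path has two possible backends, so it keeps the pointer. */
static void (*sketch_update_perf)(uint8_t, uint8_t, uint8_t) =
	sketch_shmem_update_perf;

/* True only when the fast-path callback would be registered at all. */
static bool sketch_fast_switch;

void sketch_init(bool has_msr)
{
	sketch_update_perf = has_msr ? sketch_msr_update_perf
				     : sketch_shmem_update_perf;
	sketch_fast_switch = has_msr;
}

/* Slow path: either backend, so go through the pointer. */
void sketch_target(uint8_t min, uint8_t des, uint8_t max)
{
	sketch_update_perf(min, des, max);
}

/*
 * Fast path: only meaningful when sketch_fast_switch is set, i.e. MSRs
 * exist, so call the MSR helper directly and skip the indirect call.
 */
void sketch_adjust_perf(uint8_t min, uint8_t des, uint8_t max)
{
	if (!sketch_fast_switch)
		return;
	sketch_msr_update_perf(min, des, max);
}
```

In a real driver the fast-path callback simply would not be registered
without MSR support, so even the sketch_fast_switch check should be
unnecessary there; it is only in the sketch to keep it safe standalone.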