Re: [PATCH 1/4] arm64: dts: rockchip: list all CPU supplies on ArmSoM Sige5

From: Alexey Charkov
Date: Sat Jun 21 2025 - 15:36:17 EST


On Fri, Jun 20, 2025 at 8:02 PM Alexey Charkov <alchark@xxxxxxxxx> wrote:
>
> On Wed, Jun 18, 2025 at 6:48 PM Alexey Charkov <alchark@xxxxxxxxx> wrote:
> >
> > On Wed, Jun 18, 2025 at 6:06 PM Nicolas Frattaroli
> > <nicolas.frattaroli@xxxxxxxxxxxxx> wrote:
> > >
> > > Hello,
> > >
> > > +Cc Jonas Karlman as he is intimately familiar with RK3576 clock shenanigans by now,
> > >
> > > On Wednesday, 18 June 2025 15:51:45 Central European Summer Time Alexey Charkov wrote:
> > > > On Sun, Jun 15, 2025 at 8:00 PM Piotr Oniszczuk
> > > > <piotr.oniszczuk@xxxxxxxxx> wrote:
> > > > >
> > > > >
> > > > >
> > > > > > > Message written by Alexey Charkov <alchark@xxxxxxxxx> on 9 Jun 2025, at 16:05:
> > > > > >
> > > > > > On Sun, Jun 8, 2025 at 11:24 AM Piotr Oniszczuk
> > > > > > <piotr.oniszczuk@xxxxxxxxx> wrote:
> > > > > > >>> Message written by Alexey Charkov <alchark@xxxxxxxxx> on 5 Jun 2025, at 15:42:
> > > > > >>>> Alexey,
> > > > > > >>>> I see you are using an rk3576 board like me (nanopi-m5)
> > > > > > >>>> Do you have cpu dvfs working correctly on your board?
> > > > > > >>>> I mean: are the [1] desired clocks reported by kernel sysfs in line with the [2] cur clocks?
> > > > > > >>>> In my case I see my cpu lives totally on its own with dvfs:
> > > > > >>>
> > > > > >>> Hi Piotr,
> > > > > >>>
> > > > > >>> I haven't tried to validate actual running frequencies vs. requested
> > > > > >>> frequencies, but subjective performance and power consumption seem to
> > > > > >>> be in line with what I expect.
> > > > > >>
> > > > > > >> well - my subjective l&f is that - currently - my rk3576 seems "slower" than e.g. 4xA53 h618.
> > > > > >
> > > > > > In my experience, native compilation of GCC 14 using 8 threads on
> > > > > > RK3576 (mainline with passive cooling and throttling enabled): 2 hours
> > > > > > 6 minutes, on RK3588 (mainline with passive cooling via Radxa Rock 5B
> > > > > > case and throttling enabled but never kicking in): 1 hour 10 minutes
> > > > >
> > > > > > out of curiosity I compared published 3576 vs 3588 scores:
> > > > > multithread passmark: 3675 (https://www.cpubenchmark.net/cpu.php?cpu=Rockchip+RK3576&id=6213)
> > > > > multithread passmark: 4530 (https://www.cpubenchmark.net/cpu.php?cpu=Rockchip+RK3588&id=4906)
> > > > >
> > > > > > assuming 3588 as baseline, 3576 is approx 20% slower on multithread passmark (has ~0.8 of the compute power of 3588)
> > > > > > a 70 min compile on 3588 should take something like ~86 min on 3576.
> > > > > > In your case the 126 min compile on 3576 shows 3576 delivering 0.55 of the compute power of 3588.
> > > > > > Roughly, 3576 should do this task in 40 min less than you currently see, I think
> > > > >
> > > > >
> > > > > > Can't see how u-boot would affect CPU speed in Linux, as long as you
> > > > > > use comparable ATF images. Do you use the same kernel and dtb in all
> > > > > > these cases? Also, what's your thermal setup?
> > > > >
> > > > > yes. in all cases only change was: uboot & atf
> > > > > > thermal is based on the recent collabora series (+ the recent polling fix for clocks returning from throttling)
> > > > >
> > > > > >
> > > > > >
> > > > > > Not sure UX is a particularly good measure of CPU performance, as long
> > > > > > as you've got a properly accelerated DRM graphics pipeline. More
> > > > > > likely 2D/3D and memory.
> > > > >
> > > > > indeed.
> > > > > > For a quantified look, I'm using a very simple approach to estimate the real clock: http://uob-hpc.github.io/2017/11/22/arm-clock-freq.html
> > > > > > out of curiosity I looked at what it reports on a53/a55/a72/a76 and it is surprisingly accurate :-)
> > > > > > on my 3576 with collabora uboot+mainline atf it shows 800MHz (and with the perf. gov it seems to be constant)
> > > > >
> > > > > >
> > > > > > There might be some difference in how PVTPLL behaves on RK3576 vs.
> > > > > > RK3588. But frankly first I would check if you are using comparable
> > > > > > ATF implementations (e.g. upstream TF-A in both cases), kernels and
> > > > > > thermal environment :)
> > > > >
> > > > > all tests: the same 6.15.2 mainline + some collabora patches
> > > > >
> > > > > diffs were:
> > > > > 1.collabora uboot[1] + mainline atf 2.13
> > > > > 2.collabora uboot[1] + rockchip rkbin bl31 blob
> > > > > 3.vendor uboot (bin dump from friendlyelec ubuntu image)
> > > > >
> > > > > on 1/2 i see kind of issue with clock values (i.e. perf gov gives constant 800MHz on mainline atf).
> > > > > 3 seems to perform better - (i.e. perf gov gives constant 1500MHz so all is snappier/faster)
> > > >
> > > > There is indeed something weird going on. I've tried running sbc-bench
> > > > [1], and even though I observe dynamically varying CPU frequencies
> > > > after boot with schedutil governor, once sbc-bench switches the
> > > > governor to "performance" and goes through the OPPs in descending
> > > > frequency order, the CPUs seem to get stuck at the last applied low
> > > > frequency. Even after max frequency gets reverted from 408 MHz to
> > > > something higher, even after I switch the governor to something else -
> > > > no matter what. Only a reboot gets the higher frequencies 'unstuck'
> > > > for me.
> > > >
> > > > These are all observed at around 55C SoC temperature, so throttling is
> > > > not an issue. Regulators are stuck at 950000 uV - way above 700000 uV
> > > > that the 408 MHz OPP requires (and power readings seem to match: I'm
> > > > getting about 2.3W consumption at 408 MHz in idle vs. normal idle
> > > > reading of 1.4W at around 1 GHz).
> > > >
> > > > Not sure what's going on here, and I don't remember seeing anything
> > > > similar on RK3588. Thoughts welcome.
> > >
> > > > This may once again be an "accidentally uses wrong clock IDs" type
> > > situation. The other possibility is that we're getting confused
> > > between what we think the clock rate is and what SCMI actually set
> > > the clock rate to.
> > >
> > > > Things to check are whether the right clock controller (scmi vs cru)
> > > > and the right clock id (check ATF source for this) are used.
> >
> > Clock IDs in the kernel seem to match those in ATF, but I've noticed
> > what appears to be a buffer overflow in some of the SCMI clock names
> > defined in the opensource TF-A (thanks GCC 15 and its zealous
> > warnings):
>
> After some more testing, I tend to confirm what Piotr observed
> earlier. Namely, frequency scaling acts weird on any ATF version (be
> it binary BL31 or opensource TF-A), as long as mainline u-boot is
> used. Using the u-boot binary extracted from the ArmSoM QWRT image
> does not lead to "stuck" CPU frequencies when running sbc-bench.
>
> I'm getting this with the exact same kernel build (6.16-rc1 with some
> Sige5 related patches, namely v2 of this series, Nicolas' USB
> enablement series and TSADC). The only other difference is that the
> binary u-boot doesn't have EFI support, so I had to boot into the
> ARM64 uncompressed Image instead of vmlinuz.efi, but those were both
> taken from the same build.
>
> What I'm observing during the sbc-bench run:
> - It switches the cpufreq governor from schedutil to performance
> - It goes through all CPU OPPs in descending frequency order
> --- While it does that when booted using mainline u-boot +
> vmlinuz.efi: "hardware limits" line in "cpupower -c 0,4
> frequency-info" changes with each OPP change (the max frequency
> getting reduced sequentially), then it resets to the initial full
> range, but the actual frequency stays stuck at the lowest possible
> value
> --- While it does that when booted using binary u-boot + Image:
> "hardware limits" line in "cpupower -c 0,4 frequency-info" doesn't
> change, but the actual frequency gets reduced sequentially. Then after
> the iteration over all OPPs is completed it returns to the highest
> possible value, and adjusts dynamically based on thermal throttling as
> the benchmark progresses

Slight correction: it's not the "hardware limits" line, but rather
"current policy".

Note that booting mainline u-boot in non-EFI mode (using plain Image)
doesn't change the results above.
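
For anyone who wants to watch the same thing without cpupower, here is a
minimal sketch that reads the cpufreq sysfs attributes directly. As far as
I understand, "hardware limits" corresponds to cpuinfo_{min,max}_freq and
"current policy" to scaling_{min,max}_freq plus the governor. The
looks_stuck() check is my own heuristic, not anything sbc-bench does, and
scaling_cur_freq is only what the kernel reports, which may differ from
the real hardware clock.

```python
from pathlib import Path

# Standard cpufreq sysfs location; each policyN directory covers one cluster.
CPUFREQ_ROOT = Path("/sys/devices/system/cpu/cpufreq")

# Short label -> sysfs attribute name. All of these hold values in kHz.
FREQ_FILES = {
    "hw_min": "cpuinfo_min_freq",      # "hardware limits" in cpupower
    "hw_max": "cpuinfo_max_freq",
    "policy_min": "scaling_min_freq",  # "current policy" in cpupower
    "policy_max": "scaling_max_freq",
    "cur": "scaling_cur_freq",         # kernel's view of the current rate
}


def policy_snapshot(policy_dir):
    """Read limits, policy range, current rate and governor for one policy."""
    policy_dir = Path(policy_dir)
    snap = {key: int((policy_dir / name).read_text())
            for key, name in FREQ_FILES.items()}
    snap["governor"] = (policy_dir / "scaling_governor").read_text().strip()
    return snap


def looks_stuck(snap):
    """Heuristic (assumption): the performance governor targets the policy
    maximum, so a lower reported rate is worth a closer look."""
    return snap["governor"] == "performance" and snap["cur"] < snap["policy_max"]


if __name__ == "__main__":
    for policy_dir in sorted(CPUFREQ_ROOT.glob("policy*")):
        try:
            snap = policy_snapshot(policy_dir)
        except (OSError, ValueError) as err:
            print(f"{policy_dir.name}: unreadable ({err})")
            continue
        note = "  <-- below policy max" if looks_stuck(snap) else ""
        print(f"{policy_dir.name}: gov={snap['governor']} "
              f"hw={snap['hw_min']}-{snap['hw_max']} "
              f"policy={snap['policy_min']}-{snap['policy_max']} "
              f"cur={snap['cur']}{note}")
```

Running it in a loop while sbc-bench steps through the OPPs should show
whether the policy range or only the reported current rate is changing.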

> So it seems that it's not really linked to SCMI clocks or PVTPLL per
> se, but rather what the binary u-boot configures differently vs.
> mainline before the kernel takes over, or something in other firmware
> services that the binary u-boot provides (?)
>
> I'm wondering if there is any clock related functionality in the
> OP-TEE? I didn't have any OP-TEE image in my mainline u-boot builds
> (frankly, I don't even know where to grab one), but the binary u-boot
> from ArmSoM advertises the following:
>
> I/TC: OP-TEE version: 3.13.0-791-g185dc3c92 #hisping.lin (gcc version
> 10.2.1 20201103 (GNU Toolchain for the A-profile Architecture
> 10.2-2020.11 (arm-10.16))) #2 Tue Apr 16 11:05:25 CST 2024 aarch64,
> fwver: v1.01

FTR: I've tried to bluntly wrap rk3576_bl32_v1.05.bin into an ELF file
using the following linker script and feed it as $TEE to the mainline
u-boot build, but the resulting u-boot-rockchip.bin gets stuck at boot
after checking hashes of ATF images, so I'm still lost as to how one
can check if the OP-TEE has any influence on the cpufreq behavior.

---
ENTRY(_binary_rk3576_bl32_v1_05_bin_start);

SECTIONS
{
	. = 0x08400000;
	.data : {
		*(.data)
	}
}
---

0x08400000 is the addr listed for BL32 in RK3576TRUST.ini in rkbin.
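
In case it helps someone reproduce this: the ENTRY symbol above is the
kind of name that objcopy generates when it converts a raw file with
"-I binary" (non-alphanumeric characters of the input file name become
underscores, and the payload lands in .data). The sketch below only
derives that name and prints one plausible pair of commands. The
toolchain prefix and the bl32.ld / bl32.elf file names are hypothetical
placeholders, and this message does not record the exact commands used.

```python
import re


def objcopy_symbol(input_name, suffix="start"):
    """Symbol name objcopy derives from a '-I binary' input file name."""
    return "_binary_" + re.sub(r"[^A-Za-z0-9]", "_", input_name) + "_" + suffix


def wrap_commands(blob, script, out, prefix="aarch64-linux-gnu-"):
    """Build (not run) the two steps: raw blob -> object, object -> ELF.

    'prefix' is an assumed cross-toolchain prefix; adjust or pass "" for
    a native aarch64 toolchain.
    """
    obj = out + ".o"
    return [
        [prefix + "objcopy", "-I", "binary", "-O", "elf64-littleaarch64",
         "-B", "aarch64", blob, obj],
        [prefix + "ld", "-T", script, "-o", out, obj],
    ]


if __name__ == "__main__":
    blob = "rk3576_bl32_v1.05.bin"
    print("entry symbol:", objcopy_symbol(blob))
    # bl32.ld and bl32.elf are placeholder names for this illustration.
    for cmd in wrap_commands(blob, "bl32.ld", "bl32.elf"):
        print(" ".join(cmd))
```

Note that objcopy derives the symbol from the path exactly as given on
the command line, so running it from another directory with a longer
path would produce a different symbol than the one in ENTRY().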

Best regards,
Alexey