[BUG] New arm scmi check in linux-next causing rk3568 not to boot due to firmware bug

From: Nicolas Frattaroli
Date: Wed May 04 2022 - 08:49:14 EST


Good day,

a user on the #linux-rockchip channel on the Libera.chat IRC network
reported that their RK3568 was no longer getting a CPU and GPU clock
from SCMI and consequently not booting when using linux-next. This
was bisected down to the following commit:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/firmware/arm_scmi/base.c?h=next-20220503&id=3b0041f6e10e5bdbb646d98172be43e88734ed62

The error message in the log is as follows:

arm-scmi firmware:scmi: Malformed reply - real_sz:8 calc_sz:4, t->rx.len is 12, sizeof(u32) is 4, loop_num_ret is 3
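
For reference, the numbers in that message can be reproduced with a small
standalone sketch. It assumes the new check derives the expected list size by
padding the returned protocol IDs (one byte each) up to whole u32 words, and
compares that against the received payload length minus the leading u32 that
carries the count. The function names are made up for illustration and are not
kernel API:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Payload bytes that follow the leading u32 count field.
 * Assumes rx_len >= sizeof(uint32_t).
 */
static size_t real_list_size(size_t rx_len)
{
	return rx_len - sizeof(uint32_t);
}

/*
 * Bytes needed for num_ret one-byte protocol IDs, padded up to a
 * multiple of sizeof(uint32_t). Assumes num_ret >= 1.
 */
static size_t calc_list_size(size_t num_ret)
{
	return (1 + (num_ret - 1) / sizeof(uint32_t)) * sizeof(uint32_t);
}

/* Strict comparison: the reply is accepted only if both sizes agree. */
static bool reply_size_matches(size_t rx_len, size_t num_ret)
{
	return real_list_size(rx_len) == calc_list_size(num_ret);
}
```

With the values from the log (t->rx.len of 12, loop_num_ret of 3) this gives a
real size of 8 against a calculated size of 4, so the firmware appears to send
one u32 more than three protocol IDs require.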

The Rockchip firmware (bl31) being used was v1.32, from here:

https://github.com/JeffyCN/rockchip_mirrors/blob/rkbin/bin/rk35/rk3568_bl31_v1.32.elf

This seems like a non-fatal firmware bug, for which a kernel workaround is
certainly possible, but it would be good if Rockchip could fix this in their
firmware.

The user going by "amazingfate" reported that commenting out the
ret = -EPROTO; break;
statements in the new check fixes the issue for them.
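
A less blunt variant than dropping the error path entirely might be to reject
only replies that are too short and merely warn about oversized ones. The
following is a hypothetical standalone sketch of that idea, not a proposed
patch; the enum and function names are invented for illustration, and the size
formula is my assumption about what the check computes:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of a protocol-list reply. */
enum list_check { LIST_OK, LIST_OVERSIZED, LIST_TRUNCATED };

/*
 * Compare the received payload size with the size implied by num_ret.
 * Assumes rx_len >= sizeof(uint32_t) and num_ret >= 1.
 */
static enum list_check classify_list_reply(size_t rx_len, size_t num_ret)
{
	size_t real_sz = rx_len - sizeof(uint32_t);
	size_t calc_sz = (1 + (num_ret - 1) / sizeof(uint32_t)) *
			 sizeof(uint32_t);

	if (real_sz < calc_sz)
		return LIST_TRUNCATED;	/* IDs would lie past the payload: reject */
	if (real_sz > calc_sz)
		return LIST_OVERSIZED;	/* extra trailing bytes: warn, keep going */
	return LIST_OK;
}
```

Under these assumptions the reply from the log (12 bytes, 3 protocols) would be
classified as oversized rather than rejected, while a reply that is too short
for the advertised number of protocols would still be treated as an error.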

I'm writing here to get the discussion started on how we can resolve this
before the Linux 5.19 release.

Sudeep Holla has already told me they'll gladly add a workaround before
the 5.19 release, but would rather see this fixed in the vendor firmware
first. Would Rockchip be able and willing to fix it and publish a new
bl31 for the RK3568?

Regards,
Nicolas Frattaroli