Re: arm64: csdlock at early boot due to slow serial (?)

From: Breno Leitao
Date: Thu Jul 03 2025 - 11:05:30 EST


Hello Leo,

On Thu, Jul 03, 2025 at 12:45:57PM +0100, Leo Yan wrote:
> On Thu, Jul 03, 2025 at 03:02:39AM -0700, Breno Leitao wrote:
> > > In some cases, if normal world and secure world share the same UART
> > > port, it can cause the UART state machine malfunction and long wait.
> >
> > I don't know how to check it for sure, but, looking at the serial
> > console output, I don't see anything else using the UART. The only
> > output I see on the console at that time is coming from linux kernel.
> >
> > Would you recommend any additional check?
>
> I have no experience in the driver, I should avoid any noise. But two
> things in my head for quick trying.
>
> - First, you could try earlycon mode, e.g., add option in kernel
> command line:
>
> earlycon=pl011,mmio32,0xc280000
> earlycon=pl011,0xc280000
>
> This would be possible to give more printing logs. If earlycon works
> well, then the issue might be caused by later init code (clock, or
> UART driver itself).

That is an interesting experiment, thanks for the heads-up.

I got `earlycon=pl011,mmio32,0xc280000` working, the machine boots fine,
and I see different behaviours.

Booting up the system the following lines (as examples):

[ 0.000000] earlycon: pl11 at MMIO32 0x000000000c280000 (options '115200')
[ 0.000000] printk: legacy bootconsole [pl11] enabled
[ 0.493205] smp: Brought up 1 node, 72 CPUs
[ 4.232864] SMP: Total of 72 processors activated.
[ 4.242788] CPU: All CPU(s) started at EL2
[ 6.318137] ARMH0011:00: ttyAMA0 at MMIO 0xc280000 (irq = 66, base_baud = 0) is a SBSA
[ 6.978245] netpoll: netconsole: local port 6666
[ 7.234429] Run /init as init process

Also, I have a trace_printk() in pl011_console_write_atomic, and it only
start later in the code. More precisely at:

swapper/0-1 [000] dN.1. 6.334581: pl011_console_write_atomic: Printing 49 [ 6.334573] printk: console [ttyAMA0] enabled

I understand that the early boot messages got sent by some other
function? (Or maybe the tracing framework was not initialized?)

Anyway, pl011_console_write_atomic() continues to call ~20K times cpu_relax()
waiting for the BUSY bit to disappear, as before.

For comparison, these are the same line as above, when earlycon is not
set:

[ 0.154009] smp: Brought up 1 node, 72 CPUs
[ 0.154012] SMP: Total of 72 processors activated.
[ 0.154013] CPU: All CPU(s) started at EL2
[ 0.466949] ARMH0011:00: ttyAMA0 at MMIO 0xc280000 (irq = 66, base_baud = 0) is a SBSA
[ 0.466967] printk: console [ttyAMA0] enabled
[ 0.466949] ARMH0011:00: ttyAMA0 at MMIO 0xc280000 (irq = 66, base_baud = 0) is a SBSA
[ 5.513307] smp: csd: Detected non-responsive CSD lock (#1) on CPU#0, waiting 5001000000 ns for CPU#05 do_nothing+0x0/0x10(0x0).
[ 10.719305] netpoll: netconsole: local port 6666
[ 10.961798] Run /init as init process


>From what I see, using earlycon boots the kernel faster. Here are some
ideas I got:

1) Earlycon enables the message to be printed earlier, thus, it smooth
out the serial console

2) without earlycon, the messages are printed at once, with interrupt
disabled
2.a) This causes CSD locks
2.b) CSD locks caused an additional 3 seconds delay

3) pl011_console_write_atomic is slow on both cases

Thanks for the good suggestion,
--breno