Re: [PATCH v4 1/2] sched/topology: improve topology_span_sane speed

From: K Prateek Nayak
Date: Thu Jun 12 2025 - 05:30:51 EST


Hello Leon,

Thank you for the additional info!

On 6/12/2025 1:11 PM, Leon Romanovsky wrote:
> [ 0.032188] CPU topo: Max. logical packages: 10
> [ 0.032189] CPU topo: Max. logical dies: 10
> [ 0.032189] CPU topo: Max. dies per package: 1
> [ 0.032194] CPU topo: Max. threads per core: 1
> [ 0.032194] CPU topo: Num. cores per package: 1
> [ 0.032195] CPU topo: Num. threads per package: 1
> [ 0.032195] CPU topo: Allowing 10 present CPUs plus 0 hotplug CPUs

This indicates each CPU is a socket, leading to 10 sockets ...
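
If running commands inside the guest is possible, something along these
lines would confirm the per-CPU socket and node placement (just a sketch;
it assumes util-linux's lscpu is present in the guest, otherwise the
sysfs query below works too):

# Each CPU should report its own SOCKET value if every CPU is a package
lscpu -e=CPU,NODE,SOCKET,CORE

# Without util-linux, the same information is available via sysfs
grep . /sys/devices/system/cpu/cpu*/topology/physical_package_id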

> [ 0.288498] smp: Bringing up secondary CPUs ...
> [ 0.289225] smpboot: x86: Booting SMP configuration:
> [ 0.289900] .... node #0, CPUs: #1
> [ 0.290511] .... node #1, CPUs: #2 #3
> [ 0.291559] .... node #2, CPUs: #4 #5
> [ 0.292557] .... node #3, CPUs: #6 #7
> [ 0.293593] .... node #4, CPUs: #8 #9
> [ 0.326310] smp: Brought up 5 nodes, 10 CPUs

... and this indicates two sockets are grouped as one NUMA node,
leading to 5 nodes in total. I tried the following:

qemu-system-x86_64 -enable-kvm \
-cpu EPYC-Milan-v2 -m 20G -smp cpus=10,sockets=10 \
-machine q35 \
-object memory-backend-ram,size=4G,id=m0 \
-object memory-backend-ram,size=4G,id=m1 \
-object memory-backend-ram,size=4G,id=m2 \
-object memory-backend-ram,size=4G,id=m3 \
-object memory-backend-ram,size=4G,id=m4 \
-numa node,cpus=0-1,memdev=m0,nodeid=0 \
-numa node,cpus=2-3,memdev=m1,nodeid=1 \
-numa node,cpus=4-5,memdev=m2,nodeid=2 \
-numa node,cpus=6-7,memdev=m3,nodeid=3 \
-numa node,cpus=8-9,memdev=m4,nodeid=4 \
...

but could not hit this issue with a v6.16-rc1 kernel and QEMU emulator
version 10.0.50 (v10.0.0-1610-gd9ce74873a).
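
In case the node assignment in the custom QEMU is keyed off socket-id
rather than the legacy cpus= index list, the same two-sockets-per-node
layout can also be described with the newer "-numa cpu" binding. A sketch
of what I mean (untested here; memory backends kept as above):

qemu-system-x86_64 -enable-kvm \
-cpu EPYC-Milan-v2 -m 20G -smp cpus=10,sockets=10,cores=1,threads=1 \
-machine q35 \
-object memory-backend-ram,size=4G,id=m0 \
-object memory-backend-ram,size=4G,id=m1 \
-object memory-backend-ram,size=4G,id=m2 \
-object memory-backend-ram,size=4G,id=m3 \
-object memory-backend-ram,size=4G,id=m4 \
-numa node,memdev=m0,nodeid=0 -numa node,memdev=m1,nodeid=1 \
-numa node,memdev=m2,nodeid=2 -numa node,memdev=m3,nodeid=3 \
-numa node,memdev=m4,nodeid=4 \
-numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=0,socket-id=1 \
-numa cpu,node-id=1,socket-id=2 -numa cpu,node-id=1,socket-id=3 \
-numa cpu,node-id=2,socket-id=4 -numa cpu,node-id=2,socket-id=5 \
-numa cpu,node-id=3,socket-id=6 -numa cpu,node-id=3,socket-id=7 \
-numa cpu,node-id=4,socket-id=8 -numa cpu,node-id=4,socket-id=9 \
...

Binding by socket-id should be independent of how CPU indices get
enumerated, so it may be worth a try if the cpus= form behaves
differently on your side.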

> [ 0.327532] smpboot: Total of 10 processors activated (51878.08 BogoMIPS)
> [ 0.329252] ------------[ cut here ]------------
> [ 0.329252] WARNING: CPU: 0 PID: 1 at kernel/sched/topology.c:2486 build_sched_domains+0xe67/0x13a0
> [ 0.330608] Modules linked in:
> [ 0.331050] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.16.0-rc1_for_upstream_min_debug_2025_06_09_14_44 #1 NONE
> [ 0.332386] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [ 0.333767] RIP: 0010:build_sched_domains+0xe67/0x13a0
> [ 0.334298] Code: ff ff 8b 6c 24 08 48 8b 44 24 68 65 48 2b 05 60 24 d0 01 0f 85 03 05 00 00 48 83 c4 70 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <0f> 0b e9 65 fe ff ff 48 c7 c7 28 fb 08 82 4c 89 44 24 28 c6 05 e4
> [ 0.336635] RSP: 0000:ffff8881002efe30 EFLAGS: 00010202
> [ 0.337326] RAX: 00000000ffffff01 RBX: 0000000000000002 RCX: 00000000ffffff01
> [ 0.338234] RDX: 00000000fffffff6 RSI: 0000000000000300 RDI: ffff888100047168
> [ 0.338523] RBP: 0000000000000000 R08: ffff888100047168 R09: 0000000000000000
> [ 0.339425] R10: ffffffff830dee80 R11: 0000000000000000 R12: ffff888100047168
> [ 0.340323] R13: 0000000000000002 R14: ffff888100193480 R15: ffff888380030f40
> [ 0.341221] FS: 0000000000000000(0000) GS:ffff8881b9b76000(0000) knlGS:0000000000000000
> [ 0.342298] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.343096] CR2: ffff88843ffff000 CR3: 000000000282c001 CR4: 0000000000370eb0
> [ 0.344042] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 0.344927] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 0.345811] Call Trace:
> [ 0.346191] <TASK>
> [ 0.346429] sched_init_smp+0x32/0xa0
> [ 0.346944] ? stop_machine+0x2c/0x40
> [ 0.347460] kernel_init_freeable+0xf5/0x260
> [ 0.348031] ? rest_init+0xc0/0xc0
> [ 0.348513] kernel_init+0x16/0x120
> [ 0.349008] ret_from_fork+0x5e/0xd0
> [ 0.349510] ? rest_init+0xc0/0xc0
> [ 0.349998] ret_from_fork_asm+0x11/0x20
> [ 0.350464] </TASK>
> [ 0.350812] ---[ end trace 0000000000000000 ]---

Ah! Since this happens so early, the topology isn't set up yet for
the debug prints to show anything! Is it possible to get a dmesg with
"ignore_loglevel" and "sched_verbose" from an older kernel that
did not throw this error on the same host?
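
For reference, if the guest kernel is booted directly, the switches can
be appended like this (just a sketch; it assumes a -kernel/-append style
boot with a serial console, otherwise adding them to the guest's grub
cmdline works just as well):

qemu-system-x86_64 ... \
-kernel /path/to/bzImage \
-append "console=ttyS0 ignore_loglevel sched_verbose" \
-nographic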



>> Even the qemu cmdline for the guest can help! We can try reproducing
>> it at our end then. Thank you for all the help.
>
> It is a custom QEMU with limited access to the hypervisor. This crash is
> inside the VM.

Noted! Thanks a ton for all the data provided.

--
Thanks and Regards,
Prateek