On Tue, Jul 15, 2025 at 06:04:00PM +0000, Smita Koralahalli wrote:
> This series introduces the ability to manage SOFT RESERVED iomem
> resources, enabling the CXL driver to remove any portions that
> intersect with created CXL regions.
Hi Smita,
This set applied cleanly to today's cxl-next but fails before region
probe, as shown in the log appended below.
BTW - there were sparse warnings in the build that look related:
CHECK drivers/dax/hmem/hmem_notify.c
drivers/dax/hmem/hmem_notify.c:10:6: warning: context imbalance in 'hmem_register_fallback_handler' - wrong count at exit
drivers/dax/hmem/hmem_notify.c:24:9: warning: context imbalance in 'hmem_fallback_register_device' - wrong count at exit
These aren't the full logs; I trimmed them. Let me know if you need more,
or other info to reproduce.
[ 53.652454] cxl_acpi:cxl_softreserv_mem_work_fn:888: Timeout waiting for cxl_mem probing
[ 53.653293] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:321
[ 53.653513] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1875, name: kworker/46:1
[ 53.653540] preempt_count: 1, expected: 0
[ 53.653554] RCU nest depth: 0, expected: 0
[ 53.653568] 3 locks held by kworker/46:1/1875:
[ 53.653569] #0: ff37d78240041548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x578/0x630
[ 53.653583] #1: ff6b0385dedf3e38 (cxl_sr_work){+.+.}-{0:0}, at: process_one_work+0x1bd/0x630
[ 53.653589] #2: ffffffffb33476d8 (hmem_notify_lock){+.+.}-{3:3}, at: hmem_fallback_register_device+0x23/0x60
[ 53.653598] Preemption disabled at:
[ 53.653599] [<ffffffffb1e23993>] hmem_fallback_register_device+0x23/0x60
[ 53.653640] CPU: 46 UID: 0 PID: 1875 Comm: kworker/46:1 Not tainted 6.16.0CXL-NEXT-ALISON-SR-V5+ #5 PREEMPT(voluntary)
[ 53.653643] Workqueue: events cxl_softreserv_mem_work_fn [cxl_acpi]
[ 53.653648] Call Trace:
[ 53.653649] <TASK>
[ 53.653652] dump_stack_lvl+0xa8/0xd0
[ 53.653658] dump_stack+0x14/0x20
[ 53.653659] __might_resched+0x1ae/0x2d0
[ 53.653666] __might_sleep+0x48/0x70
[ 53.653668] __kmalloc_node_track_caller_noprof+0x349/0x510
[ 53.653674] ? __devm_add_action+0x3d/0x160
[ 53.653685] ? __pfx_devm_action_release+0x10/0x10
[ 53.653688] __devres_alloc_node+0x4a/0x90
[ 53.653689] ? __devres_alloc_node+0x4a/0x90
[ 53.653691] ? __pfx_release_memregion+0x10/0x10 [dax_hmem]
[ 53.653693] __devm_add_action+0x3d/0x160
[ 53.653696] hmem_register_device+0xea/0x230 [dax_hmem]
[ 53.653700] hmem_fallback_register_device+0x37/0x60
[ 53.653703] cxl_softreserv_mem_register+0x24/0x30 [cxl_core]
[ 53.653739] walk_iomem_res_desc+0x55/0xb0
[ 53.653744] ? __pfx_cxl_softreserv_mem_register+0x10/0x10 [cxl_core]
[ 53.653755] cxl_region_softreserv_update+0x46/0x50 [cxl_core]
[ 53.653761] cxl_softreserv_mem_work_fn+0x4a/0x110 [cxl_acpi]
[ 53.653763] ? __pfx_autoremove_wake_function+0x10/0x10
[ 53.653768] process_one_work+0x1fa/0x630
[ 53.653774] worker_thread+0x1b2/0x360
[ 53.653777] kthread+0x128/0x250
[ 53.653781] ? __pfx_worker_thread+0x10/0x10
[ 53.653784] ? __pfx_kthread+0x10/0x10
[ 53.653786] ret_from_fork+0x139/0x1e0
[ 53.653790] ? __pfx_kthread+0x10/0x10
[ 53.653792] ret_from_fork_asm+0x1a/0x30
[ 53.653801] </TASK>
[ 53.654193] =============================
[ 53.654203] [ BUG: Invalid wait context ]
[ 53.654451] 6.16.0CXL-NEXT-ALISON-SR-V5+ #5 Tainted: G W
[ 53.654623] -----------------------------
[ 53.654785] kworker/46:1/1875 is trying to lock:
[ 53.654946] ff37d7824096d588 (&root->kernfs_rwsem){++++}-{4:4}, at: kernfs_add_one+0x34/0x390
[ 53.655115] other info that might help us debug this:
[ 53.655273] context-{5:5}
[ 53.655428] 3 locks held by kworker/46:1/1875:
[ 53.655579] #0: ff37d78240041548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x578/0x630
[ 53.655739] #1: ff6b0385dedf3e38 (cxl_sr_work){+.+.}-{0:0}, at: process_one_work+0x1bd/0x630
[ 53.655900] #2: ffffffffb33476d8 (hmem_notify_lock){+.+.}-{3:3}, at: hmem_fallback_register_device+0x23/0x60
[ 53.656062] stack backtrace:
[ 53.656224] CPU: 46 UID: 0 PID: 1875 Comm: kworker/46:1 Tainted: G W 6.16.0CXL-NEXT-ALISON-SR-V5+ #5 PREEMPT(voluntary)
[ 53.656227] Tainted: [W]=WARN
[ 53.656228] Workqueue: events cxl_softreserv_mem_work_fn [cxl_acpi]
[ 53.656232] Call Trace:
[ 53.656232] <TASK>
[ 53.656234] dump_stack_lvl+0x85/0xd0
[ 53.656238] dump_stack+0x14/0x20
[ 53.656239] __lock_acquire+0xaf4/0x2200
[ 53.656246] lock_acquire+0xd8/0x300
[ 53.656248] ? kernfs_add_one+0x34/0x390
[ 53.656252] ? __might_resched+0x208/0x2d0
[ 53.656257] down_write+0x44/0xe0
[ 53.656262] ? kernfs_add_one+0x34/0x390
[ 53.656263] kernfs_add_one+0x34/0x390
[ 53.656265] kernfs_create_dir_ns+0x5a/0xa0
[ 53.656268] sysfs_create_dir_ns+0x74/0xd0
[ 53.656270] kobject_add_internal+0xb1/0x2f0
[ 53.656273] kobject_add+0x7d/0xf0
[ 53.656275] ? get_device_parent+0x28/0x1e0
[ 53.656280] ? __pfx_klist_children_get+0x10/0x10
[ 53.656282] device_add+0x124/0x8b0
[ 53.656285] ? dev_set_name+0x56/0x70
[ 53.656287] platform_device_add+0x102/0x260
[ 53.656289] hmem_register_device+0x160/0x230 [dax_hmem]
[ 53.656291] hmem_fallback_register_device+0x37/0x60
[ 53.656294] cxl_softreserv_mem_register+0x24/0x30 [cxl_core]
[ 53.656323] walk_iomem_res_desc+0x55/0xb0
[ 53.656326] ? __pfx_cxl_softreserv_mem_register+0x10/0x10 [cxl_core]
[ 53.656335] cxl_region_softreserv_update+0x46/0x50 [cxl_core]
[ 53.656342] cxl_softreserv_mem_work_fn+0x4a/0x110 [cxl_acpi]
[ 53.656343] ? __pfx_autoremove_wake_function+0x10/0x10
[ 53.656346] process_one_work+0x1fa/0x630
[ 53.656350] worker_thread+0x1b2/0x360
[ 53.656352] kthread+0x128/0x250
[ 53.656354] ? __pfx_worker_thread+0x10/0x10
[ 53.656356] ? __pfx_kthread+0x10/0x10
[ 53.656357] ret_from_fork+0x139/0x1e0
[ 53.656360] ? __pfx_kthread+0x10/0x10
[ 53.656361] ret_from_fork_asm+0x1a/0x30
[ 53.656366] </TASK>
[ 53.662274] BUG: scheduling while atomic: kworker/46:1/1875/0x00000002
[ 53.663552] schedule+0x4a/0x160
[ 53.663553] schedule_timeout+0x10a/0x120
[ 53.663555] ? debug_smp_processor_id+0x1b/0x30
[ 53.663556] ? trace_hardirqs_on+0x5f/0xd0
[ 53.663558] __wait_for_common+0xb9/0x1c0
[ 53.663559] ? __pfx_schedule_timeout+0x10/0x10
[ 53.663561] wait_for_completion+0x28/0x30
[ 53.663562] __synchronize_srcu+0xbf/0x180
[ 53.663566] ? __pfx_wakeme_after_rcu+0x10/0x10
[ 53.663571] ? i2c_repstart+0x30/0x80
[ 53.663576] synchronize_srcu+0x46/0x120
[ 53.663577] kill_dax+0x47/0x70
[ 53.663580] __devm_create_dev_dax+0x112/0x470
[ 53.663582] devm_create_dev_dax+0x26/0x50
[ 53.663584] dax_hmem_probe+0x87/0xd0 [dax_hmem]
[ 53.663585] platform_probe+0x61/0xd0
[ 53.663589] really_probe+0xe2/0x390
[ 53.663591] ? __pfx___device_attach_driver+0x10/0x10
[ 53.663593] __driver_probe_device+0x7e/0x160
[ 53.663594] driver_probe_device+0x23/0xa0
[ 53.663596] __device_attach_driver+0x92/0x120
[ 53.663597] bus_for_each_drv+0x8c/0xf0
[ 53.663599] __device_attach+0xc2/0x1f0
[ 53.663601] device_initial_probe+0x17/0x20
[ 53.663603] bus_probe_device+0xa8/0xb0
[ 53.663604] device_add+0x687/0x8b0
[ 53.663607] ? dev_set_name+0x56/0x70
[ 53.663609] platform_device_add+0x102/0x260
[ 53.663610] hmem_register_device+0x160/0x230 [dax_hmem]
[ 53.663612] hmem_fallback_register_device+0x37/0x60
[ 53.663614] cxl_softreserv_mem_register+0x24/0x30 [cxl_core]
[ 53.663637] walk_iomem_res_desc+0x55/0xb0
[ 53.663640] ? __pfx_cxl_softreserv_mem_register+0x10/0x10 [cxl_core]
[ 53.663647] cxl_region_softreserv_update+0x46/0x50 [cxl_core]
[ 53.663654] cxl_softreserv_mem_work_fn+0x4a/0x110 [cxl_acpi]
[ 53.663655] ? __pfx_autoremove_wake_function+0x10/0x10
[ 53.663658] process_one_work+0x1fa/0x630
[ 53.663662] worker_thread+0x1b2/0x360
[ 53.663664] kthread+0x128/0x250
[ 53.663666] ? __pfx_worker_thread+0x10/0x10
[ 53.663668] ? __pfx_kthread+0x10/0x10
[ 53.663670] ret_from_fork+0x139/0x1e0
[ 53.663672] ? __pfx_kthread+0x10/0x10
[ 53.663673] ret_from_fork_asm+0x1a/0x30
[ 53.663677] </TASK>
[ 53.700107] BUG: scheduling while atomic: kworker/46:1/1875/0x00000002
[ 53.700264] INFO: lockdep is turned off.
[ 53.701315] Preemption disabled at:
[ 53.701316] [<ffffffffb1e23993>] hmem_fallback_register_device+0x23/0x60
[ 53.701631] CPU: 46 UID: 0 PID: 1875 Comm: kworker/46:1 Tainted: G W 6.16.0CXL-NEXT-ALISON-SR-V5+ #5 PREEMPT(voluntary)
[ 53.701633] Tainted: [W]=WARN
[ 53.701635] Workqueue: events cxl_softreserv_mem_work_fn [cxl_acpi]
[ 53.701638] Call Trace:
[ 53.701638] <TASK>
[ 53.701640] dump_stack_lvl+0xa8/0xd0
[ 53.701644] dump_stack+0x14/0x20
[ 53.701645] __schedule_bug+0xa2/0xd0
[ 53.701649] __schedule+0xe6f/0x10d0
[ 53.701652] ? debug_smp_processor_id+0x1b/0x30
[ 53.701655] ? lock_release+0x1e6/0x2b0
[ 53.701658] ? trace_hardirqs_on+0x5f/0xd0
[ 53.701661] schedule+0x4a/0x160
[ 53.701662] schedule_timeout+0x10a/0x120
[ 53.701664] ? debug_smp_processor_id+0x1b/0x30
[ 53.701666] ? trace_hardirqs_on+0x5f/0xd0
[ 53.701667] __wait_for_common+0xb9/0x1c0
[ 53.701668] ? __pfx_schedule_timeout+0x10/0x10
[ 53.701670] wait_for_completion+0x28/0x30
[ 53.701671] __synchronize_srcu+0xbf/0x180
[ 53.701677] ? __pfx_wakeme_after_rcu+0x10/0x10
[ 53.701682] ? i2c_repstart+0x30/0x80
[ 53.701685] synchronize_srcu+0x46/0x120
[ 53.701687] kill_dax+0x47/0x70
[ 53.701689] __devm_create_dev_dax+0x112/0x470
[ 53.701691] devm_create_dev_dax+0x26/0x50
[ 53.701693] dax_hmem_probe+0x87/0xd0 [dax_hmem]
[ 53.701695] platform_probe+0x61/0xd0
[ 53.701698] really_probe+0xe2/0x390
[ 53.701700] ? __pfx___device_attach_driver+0x10/0x10
[ 53.701701] __driver_probe_device+0x7e/0x160
[ 53.701703] driver_probe_device+0x23/0xa0
[ 53.701704] __device_attach_driver+0x92/0x120
[ 53.701706] bus_for_each_drv+0x8c/0xf0
[ 53.701708] __device_attach+0xc2/0x1f0
[ 53.701710] device_initial_probe+0x17/0x20
[ 53.701711] bus_probe_device+0xa8/0xb0
[ 53.701712] device_add+0x687/0x8b0
[ 53.701715] ? dev_set_name+0x56/0x70
[ 53.701717] platform_device_add+0x102/0x260
[ 53.701718] hmem_register_device+0x160/0x230 [dax_hmem]
[ 53.701720] hmem_fallback_register_device+0x37/0x60
[ 53.701722] cxl_softreserv_mem_register+0x24/0x30 [cxl_core]
[ 53.701734] walk_iomem_res_desc+0x55/0xb0
[ 53.701738] ? __pfx_cxl_softreserv_mem_register+0x10/0x10 [cxl_core]
[ 53.701745] cxl_region_softreserv_update+0x46/0x50 [cxl_core]
[ 53.701751] cxl_softreserv_mem_work_fn+0x4a/0x110 [cxl_acpi]
[ 53.701752] ? __pfx_autoremove_wake_function+0x10/0x10
[ 53.701756] process_one_work+0x1fa/0x630
[ 53.701760] worker_thread+0x1b2/0x360
[ 53.701762] kthread+0x128/0x250
[ 53.701765] ? __pfx_worker_thread+0x10/0x10
[ 53.701766] ? __pfx_kthread+0x10/0x10
[ 53.701768] ret_from_fork+0x139/0x1e0
[ 53.701771] ? __pfx_kthread+0x10/0x10
[ 53.701772] ret_from_fork_asm+0x1a/0x30
[ 53.701777] </TASK>